
Yann LeCun's Home Page

Welcome to Yann's home page.

Blog/News (Google+ Stream)

Contact Information

Yann LeCun,
Director of AI Research, Facebook
Founding Director of the NYU Center for Data Science
Silver Professor of Computer Science, Neural Science, and Electrical and Computer Engineering,
The Courant Institute of Mathematical Sciences,
Center for Neural Science, and
Electrical and Computer Engineering Department, NYU School of Engineering
New York University.
Room 1220, 715 Broadway, New York, NY 10003, USA.
(212)998-3283     yann [ a t ] cs.nyu.edu     yann [ a t ] fb.com
Administrative aide: Hong Tam (212)998-3374     hongtam [ a t ] cs.nyu.edu
Note: the best way to reach me is by email or through Hong (I don't check my voicemail very often).

My main research interests are Machine Learning, Computer Vision, Mobile Robotics, and Computational Neuroscience. I am also interested in Data Compression, Digital Libraries, the Physics of Computation, and all the applications of machine learning (Vision, Speech, Language, Document understanding, Data Mining, Bioinformatics).

Computational and Biological Learning Lab

My lab at the Courant Institute of New York University is called the Computational and Biological Learning Lab. It is part of VLG: the Vision-Learning-Graphics Group, an informal group of researchers interested in pixels (in analyzing them or in producing them).

See research projects descriptions, lab member pages, events, demos, datasets...

We are working on a class of learning systems called Energy-Based Models, and on Deep Belief Networks. We are also working on convolutional nets for visual recognition, and on a type of graphical model known as factor graphs.

We have projects in computer vision, object detection, object recognition, mobile robotics, bio-informatics, biological image analysis, medical signal processing, signal processing, and financial prediction.

Teaching

Jump to my course page at NYU, and see course descriptions, slides, course material...

Talks and Tutorials

See, watch, and hear talks and tutorials.

Pamphlets and opinions

Proposal for a new publishing model in Computer Science

Many Computer Science researchers complain that our emphasis on highly selective conference publications and our double-blind reviewing system stifle innovation and slow the rate of progress of science and technology.

This pamphlet proposes a new publishing model based on an open repository and open (but anonymous) reviews which creates a "market" between papers and reviewing entities.

MORE INFORMATION >>>>>

Deep Learning

Animals and humans can learn to see, perceive, act, and communicate with an efficiency that no Machine Learning method can approach. The brains of humans and animals are "deep", in the sense that each action is the result of a long chain of synaptic communications (many layers of processing). We are researching efficient learning algorithms for such "deep architectures". In particular, we are concentrating on unsupervised learning algorithms that can be used to produce deep hierarchies of features for visual recognition. We surmise that understanding deep learning will not only enable us to build more intelligent machines, but will also help us understand human intelligence and the mechanisms of human learning.

MORE INFORMATION >>>>>.

Relational Regression

We are developing a new type of relational graphical model that can be applied to "structured regression problems". A prime example of a structured regression problem is the prediction of house prices. The price of a house depends not only on the characteristics of the house, but also on the prices of similar houses in the neighborhood, or perhaps on hidden features of the neighborhood that influence them. Our relational regression model infers a hidden "desirability surface" from which house prices are predicted.

MORE INFORMATION >>>>>.

Mobile Robotics

The purpose of the LAGR project, funded by the US government, is to design vision and learning algorithms that allow mobile robots to navigate in complex outdoor environments solely from camera input.

My lab, in collaboration with Net-Scale Technologies, is one of 8 participants in the program (Applied Perception Inc., Georgia Tech, JPL, NIST, NYU/Net-Scale, SRI, U. Penn, Stanford).

Each LAGR team received an identical copy of the LAGR robot, built by CMU/NREC.

The government periodically runs competitions between the teams. The software from each team is loaded and run by the government team on their robot.

The robot is given the GPS coordinates of a goal to which it must drive as fast as possible. The terrain is unknown in advance. The robot is run three times through the test course.

The software can use the knowledge acquired during the early runs to improve its performance on the later runs.

CLICK HERE FOR MORE INFORMATION, VIDEOS, PICTURES >>>>>.

Prior to the LAGR project, we worked on the DAVE project, an attempt to train a small mobile robot to drive autonomously in off-road environments by looking over the shoulder of a human operator.

CLICK HERE FOR INFORMATION ON THE DAVE PROJECT >>>>>.

Energy-Based Models

Energy-Based Models (EBMs) capture dependencies between variables by associating a scalar energy to each configuration of the variables. Inference consists in clamping the value of observed variables and finding configurations of the remaining variables that minimize the energy. Learning consists in finding an energy function in which observed configurations of the variables are given lower energies than unobserved ones. The EBM approach provides a common theoretical framework for many learning models, including traditional discriminative and generative approaches, as well as graph-transformer networks, conditional random fields, maximum margin Markov networks, and several manifold learning methods.
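
The inference and learning steps described above can be sketched on a toy problem. This is a minimal illustration only, assuming a hypothetical quadratic energy E(x, y) = (y - w*x)^2 with a single parameter w; the function names, the candidate grid, and the learning rate are all assumptions made for the example and do not come from the lab's software.

```python
import numpy as np

def energy(x, y, w):
    """Hypothetical energy: squared mismatch between y and the linear prediction w*x."""
    return (y - w * x) ** 2

def infer(x, w, candidates):
    """Inference: clamp the observed x and return the candidate y with the lowest energy."""
    energies = energy(x, candidates, w)
    return candidates[np.argmin(energies)]

def train(xs, ys, w=0.0, lr=0.01, epochs=200):
    """Learning: lower the energy of observed (x, y) pairs by gradient descent on w."""
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            grad = -2.0 * x * (y - w * x)  # derivative of the energy with respect to w
            w -= lr * grad
    return w

# Observed configurations generated by y = 2x (toy data, assumed for illustration)
xs = np.array([1.0, 2.0, 3.0])
ys = 2.0 * xs

w = train(xs, ys)
candidates = np.linspace(-10.0, 10.0, 201)  # grid of possible y values
y_hat = infer(4.0, w, candidates)
print(w, y_hat)
```

With this fixed quadratic form, pushing down the energy of observed pairs is enough, because the energy of every other y rises automatically. More flexible architectures generally also need a training criterion that pushes up the energy of unobserved configurations.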

Probabilistic models must be properly normalized, which sometimes requires evaluating intractable integrals over the space of all possible variable configurations. Since EBMs have no requirement for proper normalization, this problem is naturally circumvented. EBMs can be viewed as a form of non-probabilistic factor graphs, and they provide considerably more flexibility in the design of architectures and training criteria than probabilistic approaches.
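
For reference, one common way to turn an energy into a properly normalized conditional probability is the Gibbs distribution. The symbols below (the energy E, observed variables X, remaining variables Y, and a positive constant \beta) are notation assumed here for illustration:

```latex
P(Y \mid X) = \frac{e^{-\beta E(Y, X)}}{\int_{y} e^{-\beta E(y, X)} \, dy}
```

The integral in the denominator is the normalization term that can be intractable. Finding the configuration of Y that minimizes E(Y, X) does not require evaluating it.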

CLICK HERE FOR MORE INFORMATION, PICTURES, PAPERS >>>>>.

Invariant Object Recognition


The recognition of generic object categories with invariance to pose, lighting, diverse backgrounds, and the presence of clutter is one of the major challenges of Computer Vision.

I am developing learning systems that can recognize generic objects purely from their shape, independently of pose and lighting.


The NORB dataset for generic object recognition is available for download.

CLICK HERE FOR MORE INFORMATION, PICTURES, PAPERS >>>>>.

Lush: A Programming Language for Research

Tired of Matlab? Lush is an easy-to-learn, open-source object-oriented programming language designed for researchers, experimenters, and engineers working in large-scale numerical and graphic applications.

Lush combines three languages in one: a very simple to use, loosely-typed interpreted language, a strongly-typed compiled language with the same syntax, and the C language, which can be freely mixed with the other languages within a single source file, and even within a single function.

Lush has a library of over 14,000 functions and classes, some of which are simple interfaces to popular libraries: vector/matrix/tensor algebra, linear algebra (LAPACK, BLAS), numerical functions (GSL), 2D and 3D graphics (X, SDL, OpenGL, OpenRM, PostScript), image processing, computer vision (OpenCV), machine learning (gblearning, Torch), regular expressions, audio processing (ALSA), and video grabbing (Video4linux).

If you do research and development in signal processing, image processing, machine learning, computer vision, bio-informatics, data mining, statistics, or artificial intelligence, and feel limited by Matlab and other existing tools, Lush is for you. If you want a simple environment to experiment with graphics, video, and sound, Lush is for you. Lush is Free Software (GPL) and runs under GNU/Linux, Solaris, and Irix.

VISIT THE LUSH HOME PAGE >>>>

DjVu: The Document Format for Digital Libraries

My main research topic until I left AT&T was the DjVu project. DjVu is a document format, a set of compression methods and a software platform for distributing scanned and digitally produced documents on the Web. DjVu image files of scanned documents are typically 3-8 times smaller than PDF or TIFF-groupIV for bitonal and 5-10 times smaller than PDF or JPEG for color (at 300 DPI). DjVu versions of digitally produced documents are more compact and render much faster than the PDF or PostScript versions.

Hundreds of websites around the world are using DjVu for Web-based and CDROM-based document repositories and digital libraries.

Learning and Visual Perception

My main research interest is machine learning, particularly how it applies to perception, and more particularly to visual perception.

I am currently working on two architectures for gradient-based perceptual learning: graph transformer networks and convolutional networks.

Convolutional Nets are a special kind of neural net architecture designed to recognize images directly from pixel data. Convolutional Nets can be trained to detect, segment and recognize objects with excellent robustness to noise, and variations of position, scale, angle, and shape.
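
The core operation behind this robustness to position can be sketched in a few lines. This is a minimal illustration, not LeNet-5: it applies a single hand-set filter (in a trained convolutional net the filter weights are learned), and the function name and toy images are assumptions made for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the map of weighted sums.

    As in most neural-net code, the kernel is not flipped (cross-correlation).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark on the left, bright on the right (vertical edge before column 3)
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# The same pattern moved one pixel to the right
image_shifted = np.zeros((5, 6))
image_shifted[:, 4:] = 1.0

kernel = np.array([[-1.0, 1.0]])  # hypothetical hand-set dark-to-bright edge detector

response = conv2d_valid(image, kernel)
response_shifted = conv2d_valid(image_shifted, kernel)
print(np.argmax(response[0]), np.argmax(response_shifted[0]))
```

Because the same weights are applied at every location, moving the pattern in the input simply moves the peak in the output map by the same amount.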

Have a look at the animated demonstrations of LeNet-5, a Convolutional Net trained to recognize handwritten digit strings.

Convolutional nets and graph transformer networks are embedded in several high speed scanners used by banks to read checks. A system I helped develop reads an estimated 10 percent of all the checks written in the US.

Check out this page, and/or read this paper to learn more about Convolutional Nets and graph transformer networks.

MNIST Handwritten Digit Database

The MNIST database contains 60,000 training samples and 10,000 test samples of size-normalized handwritten digits. This database was derived from the original NIST databases.

MNIST is widely used by researchers as a benchmark for testing pattern recognition methods, and by students for class projects in pattern recognition, machine learning, and statistics.

Music and Hobbies

I have several interests besides my family (my wife and three sons) and my research:

  • Playing Music: particularly Jazz, Renaissance and Baroque music. A few MP3 and MIDI files of Renaissance music are available here.
  • Building and flying miniature flying contraptions: preferably battery powered, radio controlled, and unconventional in their design.
  • Building robots: particularly Lego robots (before the days of the Lego Mindstorms)
  • Hacking various computing equipment: I owned 5 computers between 1978 and 1992: SYM-1, OSI C2-4P, Commodore 64, Amiga 1000, Amiga 4000. Then I lost interest in personal computing when the only thing you could get was a boring Wintel box. Then Linux appeared and I came back to life...
  • Sailing: I own two sport catamarans, a Nacra 5.8 and a Prindle 19. I also sail and race larger boats with friends.
  • Graphic Design: I designed the DjVu logo and much of the AT&T DjVu web site.
  • Reading European comics. Comics in certain European countries (France, Belgium, Italy, Spain) are considered a true art form ("le 8-ieme art"), and not just a business with products targeted at teenagers like on this side of the pond. Although I don't have a shred of evidence to support it, I claim to have the largest private collection of French-language comics in the Eastern US.
  • Making bad puns in French, but I don't have much of an audience this side of the pond.
  • Sipping wine, particularly red, particularly French, particularly Bordeaux, particularly Saint-Julien.

Bib2Web: Automatic Creation of Publication Pages

No deep science here, but if you are looking for a simple/automatic way to make all your publications (digital or paper-based) available on your web page, visit Bib2Web.

Photo Galleries

Fun Stuff

Previous Life

My former group at AT&T (the Image Processing Research Department) and its ancestor (Larry Jackel's Adaptive Systems Research Department) made numerous contributions to Machine Learning, Image Compression, Pattern Recognition, Synthetic Persons (talking heads), and Neural-Net Hardware. Specific contributions not mentioned elsewhere on this site include the ever so popular Support Vector Machine, the PlayMail and Virt2Elle synthetic talking heads, the Net32K and ANNA neural net chips, and many others. Visit my former group's home page for more details.

Links

Links to interesting places on the web, friends' home pages, etc.






Yann LeCun, Professor
The Courant Institute of Mathematical Sciences
Room 1221, 715 Broadway, New York, NY 10012, USA
tel: (212)998-3283

Copyright © 2000-2004 Yann LeCun.
