This is a recent, exploratory side project on classifying objects that look alike but differ physically. A good example is geodesic polyhedra of different frequencies, which can fool vision systems, both color and depth, as shown below.
The surface geometry of the object can be estimated using a robot gripper equipped with a GelSight sensor. A CNN with the MobileNet architecture was trained to classify the objects from the raw GelSight data. It worked fairly well (~90% accuracy) on the test objects. Analyzing the misclassifications indicated that the tactile data is not equally informative in every grasp. We humans occasionally get confused in the same way if we grab objects with just two fingers. We would then either close the fingers further to increase the contact area with the object, or roll the object between our fingers, as in the following video, to classify it.
This gives us more tactile data and increases our belief in the classification. This work explores whether such a capability can improve tactile sensing for robots.
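One way to make "increasing belief" concrete is a recursive Bayesian update over the object classes. The sketch below is only an illustration and not necessarily what the project does: it assumes a uniform prior, treats each grasp's classifier output as independent evidence, and uses made-up probability values. The function name `update_belief` is hypothetical.

```python
def update_belief(belief, grasp_probs, eps=1e-9):
    """Fuse one new grasp into the current belief over object classes.

    Assumption: grasps are conditionally independent given the class, so the
    new belief is proportional to the old belief times the new class scores.
    """
    if len(belief) != len(grasp_probs):
        raise ValueError("belief and grasp_probs must have the same length")
    # eps keeps a single overconfident zero from wiping out a class forever.
    unnormalized = [b * max(p, eps) for b, p in zip(belief, grasp_probs)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical classifier outputs for two successive grasps of the same object.
belief = [0.5, 0.5]  # uniform prior over two classes
for probs in ([0.6, 0.4], [0.7, 0.3]):
    belief = update_belief(belief, probs)

print(belief)
```

Two weakly confident grasps that agree end up more confident together than either one alone, which is the intuition behind collecting more contact data.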
Much research has been done on tactile object recognition as well as on in-hand manipulation. Unlike those approaches, this work explores learning a repertoire of finger movements that could maximize in-hand object recognition and localization capabilities.
The following video shows the prototype gripper classifying two test objects (geodesic spheres with hexagonal and triangular faces, whose difference is easier to feel by touch).
We can see that it misclassifies the objects once in a while.
A modular third axis is inserted between the finger and the gripper so that the object can be rotated in hand.
The classification probabilities during this motion are averaged to get a more accurate estimate of the object class.
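The averaging step can be sketched as follows. This is a minimal illustration, assuming the classifier emits one probability vector per tactile frame captured during the roll. The function names, class labels, and numbers are hypothetical and not taken from the project.

```python
def average_probabilities(frame_probs):
    """Average per-frame class probability vectors collected during one roll."""
    if not frame_probs:
        raise ValueError("need at least one frame")
    n_classes = len(frame_probs[0])
    if any(len(p) != n_classes for p in frame_probs):
        raise ValueError("all frames must have the same number of classes")
    n_frames = len(frame_probs)
    return [sum(p[c] for p in frame_probs) / n_frames for c in range(n_classes)]


def classify(frame_probs, labels):
    """Return the label with the highest mean probability, plus the mean vector."""
    mean = average_probabilities(frame_probs)
    best = max(range(len(mean)), key=mean.__getitem__)
    return labels[best], mean


# Hypothetical outputs for three frames; the second frame alone would be misclassified.
frames = [[0.8, 0.2], [0.4, 0.6], [0.7, 0.3]]
label, mean = classify(frames, ["hexagonal", "triangular"])
print(label, mean)
```

Averaging lets the frames with good contact outvote an occasional poor one, at the cost of weighting every frame equally.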
(Note: this is still an ongoing project and has so far been tested only with symmetric objects, which are easy to roll.)
More experimentation, primarily on the following, will be explored as time permits.
- 3D reconstruction using techniques such as ICP (Iterative Closest Point).
- Recurrent neural network to extract temporal information during rolling.