Detecting semi-transparent objects is a challenging problem in computer vision, especially in generalized settings. Existing solutions tackle it either in a controlled environment or with specialized equipment. The goal of this project was to perform detection in generalized environments, meaning that it should work in any new surroundings, without special equipment such as time-of-flight cameras or X-ray tomography.
The biggest obstacle to this goal was the lack of the large amounts of data required to train convolutional neural networks (CNNs) for generalized settings. It was addressed by producing images of semi-transparent objects through two synthetic means: computer graphics and generative adversarial networks.
More than 200,000 synthetic images were generated in 54 different environments, along with 12,000 real-world images from 13 different environments. In total, 25 semi-transparent objects (in this case, drinking glasses) were used. Some examples of synthetically generated images are shown below.
Different combinations of real-world and synthetic images were used to train 5 different CNNs, with training running for several days on high-performance computing systems.
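As a rough illustration of what "different combinations" of real-world and synthetic images can look like in practice, the sketch below draws a training set with a chosen share of synthetic images. It is a minimal sketch under stated assumptions, not the project's actual pipeline: the function name `mix_training_set`, the file names, the pool sizes, and the 75% synthetic share are all hypothetical.

```python
import random


def mix_training_set(real, synthetic, synthetic_fraction, size, seed=0):
    """Sample `size` images so that about `synthetic_fraction` are synthetic.

    Hypothetical helper: the mixing ratios used in the project are not stated.
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    n_synthetic = round(size * synthetic_fraction)
    n_real = size - n_synthetic
    if n_synthetic > len(synthetic) or n_real > len(real):
        raise ValueError("not enough images for the requested mix")
    # Sample without replacement from each pool, then shuffle the result.
    mixed = rng.sample(synthetic, n_synthetic) + rng.sample(real, n_real)
    rng.shuffle(mixed)
    return mixed


# Hypothetical file names standing in for the two image pools.
real_images = [f"real_{i:05d}.jpg" for i in range(1200)]
synthetic_images = [f"synthetic_{i:06d}.png" for i in range(20000)]

train = mix_training_set(real_images, synthetic_images,
                         synthetic_fraction=0.75, size=1000)
```

Sweeping `synthetic_fraction` from 0 to 1 would give one training set per combination, each of which could then be used to train and compare the networks.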
The best detection accuracy on unknown test environments was close to 80%. When the test environment was known beforehand, performing transfer learning on a few images yielded an accuracy of 92%.
These are excellent results given how challenging semi-transparent objects are: at certain angles and under certain light reflections, they can be invisible even to the human eye.
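Measuring accuracy on "unknown test environments" implies that the test images come from environments the network never saw during training. One way to set that up is to hold out whole environments rather than individual images, as sketched below. This is an assumption about the evaluation protocol, which is not described here; the function `split_by_environment`, the file names, and the number of held-out environments are hypothetical.

```python
import random
from collections import defaultdict


def split_by_environment(samples, n_test_envs, seed=0):
    """Split (image_path, environment_id) pairs by holding out whole environments.

    Hypothetical helper: no environment appears in both train and test.
    """
    by_env = defaultdict(list)
    for path, env in samples:
        by_env[env].append(path)
    envs = sorted(by_env)
    if not 0 < n_test_envs < len(envs):
        raise ValueError("n_test_envs must leave at least one training environment")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    test_envs = set(rng.sample(envs, n_test_envs))
    train = [(p, e) for p, e in samples if e not in test_envs]
    test = [(p, e) for p, e in samples if e in test_envs]
    return train, test, test_envs


# Hypothetical data: 13 environments with 10 images each.
samples = [(f"env{e:02d}_img{i:03d}.jpg", e) for e in range(13) for i in range(10)]
train, test, held_out = split_by_environment(samples, n_test_envs=3)
```

For the known-environment case, a few images from a held-out environment would instead be moved into a small fine-tuning set before testing on the rest.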