For the longest time, teaching a machine how to think and learn has been considered one of the biggest technological challenges. Since Alan Turing first laid out the problem theoretically, there has been significant progress on it. The scientific research devoted to it has given birth to fields like artificial intelligence and machine learning.
As that challenge saw significant progress, researchers took up another one: giving machines the sense of sight. This blog goes through the fundamental problems in the field of computer vision and links to different research work done to solve them.
Most problems within computer vision currently deal with either detecting and classifying an object, or detecting motion of some kind. Most of the cases discussed here are interesting applications of motion detection, along with one that is a logical extension of the classification problem. We first explain classification and motion detection and then move on to the cases.
1. Object Classification
Object classification is one of the most common problems for which computer vision models are employed. It involves first detecting an object in a given frame and then classifying it into a given category. The input can be a single picture or a video that is processed frame by frame. A frame can contain one, several, or no instances of the different categories of objects.
Using convolutional neural networks (CNNs) and deep learning (DL) methods for object classification has yielded impressive results. However, traditional computer vision (CV) algorithms can also be combined with DL into hybrid algorithms that draw on the best of both worlds. Nevertheless, DL on its own remains hard to match on classification problems.
2. Object Size Filters
After detecting an object, the need to get the size of the object can arise. Once an algorithm can measure object sizes, a filter can be built on top of it to find objects of a specific size. Using OpenCV and Python, one approach is to find the pixel per metric ratio, i.e. the number of pixels that correspond to one unit of real-world measurement.
The pixel per metric ratio is found using a reference object. We know the real-world dimensions of this object and can measure its dimensions in pixels in the reference picture. Once this ratio is established, the size of other objects is found by dividing their pixel dimensions, taken from the coordinates of the objects, by the ratio.
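The arithmetic behind this can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the bounding-box format `(x_min, y_min, x_max, y_max)`, and the numbers are hypothetical, and the boxes are assumed to come from a detector that has already run. It also assumes all objects lie roughly in the same plane as the reference object, viewed head-on.

```python
def pixels_per_metric(ref_width_px, ref_width_real):
    """Pixels per real-world unit, from a reference object of known width."""
    if ref_width_real <= 0:
        raise ValueError("reference width must be positive")
    return ref_width_px / ref_width_real


def object_size(box, ratio):
    """Convert a pixel bounding box (x_min, y_min, x_max, y_max) to real units."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) / ratio, (y_max - y_min) / ratio


def filter_by_width(boxes, ratio, min_width, max_width):
    """Keep only the boxes whose real-world width lies in [min_width, max_width]."""
    return [b for b in boxes
            if min_width <= object_size(b, ratio)[0] <= max_width]


# Hypothetical numbers: a reference object 24.0 mm wide spans 120 px.
ratio = pixels_per_metric(120.0, 24.0)            # 5.0 px per mm
boxes = [(10, 10, 60, 110), (200, 50, 500, 150)]  # assumed detector output

print([object_size(b, ratio) for b in boxes])     # [(10.0, 20.0), (60.0, 20.0)]
print(filter_by_width(boxes, ratio, 5.0, 20.0))   # [(10, 10, 60, 110)]
```

In a real pipeline the pixel widths would come from contours or detections, but the size filter itself stays this simple.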
3. Object Tracking
Detecting motion within a visual frame is crucial for applications like security, animal research, traffic monitoring, and behavior analysis, to name a few. It is therefore of significant interest to computer vision researchers and practitioners. This paper takes you through the traditional computer vision methods for motion detection and where the field stands today.
The following cases are different extensions of the fundamental problem of motion detection. Some extend traditional methods to fit their requirements, while others introduce new techniques altogether.
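To give a feel for what a traditional method looks like, here is a minimal sketch of frame differencing, one of the simplest motion detection techniques. It is an illustration chosen for this blog, not necessarily a method from the paper above. Frames are assumed to be grayscale images stored as nested lists of intensities, and the function names and threshold values are hypothetical.

```python
def changed_pixels(prev_frame, curr_frame, threshold=25):
    """Count pixels whose grayscale intensity changed by more than `threshold`."""
    count = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            if abs(c - p) > threshold:
                count += 1
    return count


def motion_detected(prev_frame, curr_frame, threshold=25, min_changed=2):
    """Report motion when at least `min_changed` pixels changed noticeably."""
    return changed_pixels(prev_frame, curr_frame, threshold) >= min_changed


# Two tiny hypothetical 3x4 frames: a bright blob appears in the second one.
frame_a = [[10, 10, 10, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 10]]
frame_b = [[10, 10, 10, 10],
           [10, 200, 200, 10],
           [10, 10, 12, 10]]

print(changed_pixels(frame_a, frame_b))   # 2 (the small 10 -> 12 change is ignored)
print(motion_detected(frame_a, frame_b))  # True
```

The threshold keeps sensor noise from being counted as motion, and the minimum pixel count does the same for tiny isolated flickers.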
4. Tripwire Event
The tripwire event is a specific type of occurrence that involves motion tracking and uses video analytics technology. A virtual wire is drawn in a specific area of a video scene, and if that line is crossed, an alarm is triggered.
This paper uses an algorithm called Continuously Adaptive Mean Shift (CAMShift) for motion tracking and develops an application using OpenCV and C. The application has three stages: detecting motion in the video, tracking it, and checking whether it crosses the drawn wire.
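The last stage, checking whether a tracked object crossed the wire, comes down to a segment intersection test. The sketch below is written in Python for readability and is not the paper's implementation. It assumes the tracker (CAMShift or any other) already supplies one centroid per frame, and the function names and coordinates are hypothetical.

```python
def _orient(a, b, c):
    """Cross product (b - a) x (c - a); its sign tells which side of line ab c is on."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def crosses_tripwire(prev_pos, curr_pos, wire_start, wire_end):
    """True if the move prev_pos -> curr_pos strictly crosses the wire segment."""
    d1 = _orient(wire_start, wire_end, prev_pos)
    d2 = _orient(wire_start, wire_end, curr_pos)
    d3 = _orient(prev_pos, curr_pos, wire_start)
    d4 = _orient(prev_pos, curr_pos, wire_end)
    # The endpoints of each segment must lie on opposite sides of the other one.
    return d1 * d2 < 0 and d3 * d4 < 0


def first_crossing(track, wire_start, wire_end):
    """Index of the first frame at which the track crosses the wire, or None."""
    for i in range(1, len(track)):
        if crosses_tripwire(track[i - 1], track[i], wire_start, wire_end):
            return i
    return None


# Hypothetical vertical wire at x = 5 and a centroid track moving left to right.
wire = ((5, 0), (5, 10))
track = [(1, 5), (3, 5), (4, 6), (7, 6), (9, 6)]

print(first_crossing(track, *wire))  # 3
```

Because the test works on the segment between consecutive positions, it still fires when a fast object jumps over the wire between two frames.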
5. Exit Event
Exit events are among the simpler problems in computer vision. The task is to figure out when an object has exited a certain area of interest. The area could also be one where cameras are forbidden, e.g. toilets or changing rooms. In the regular case, it is a simple matter of detecting the object and establishing a boundary which, when crossed, counts as an exit.
However, in the case of camera-forbidden areas, this paper suggests an approach that monitors people from the outside, based on the time they spent inside, and also catches any changes in transformations. This is done through cameras placed outside the area. Next, it uses a spatial transition-based event.
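For the regular case, where crossing a boundary counts as an exit, a minimal sketch could look like this. It assumes a rectangular area of interest and one tracked centroid per frame; the function names, the region format `(x_min, y_min, x_max, y_max)`, and the sample track are hypothetical.

```python
def inside(point, region):
    """True if the point lies within the rectangle (x_min, y_min, x_max, y_max)."""
    x, y = point
    x_min, y_min, x_max, y_max = region
    return x_min <= x <= x_max and y_min <= y <= y_max


def exit_frames(track, region):
    """Frame indices at which the track goes from inside the region to outside."""
    exits = []
    for i in range(1, len(track)):
        if inside(track[i - 1], region) and not inside(track[i], region):
            exits.append(i)
    return exits


# Hypothetical area of interest and centroid track (one position per frame).
region = (0, 0, 10, 10)
track = [(2, 2), (6, 5), (11, 5), (8, 5), (8, 12)]

print(exit_frames(track, region))  # [2, 4]
```

Only inside-to-outside transitions are reported, so an object that re-enters and leaves again produces a second exit event.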
6. Loitering Event
Loitering means staying in a specific area for longer than expected, often an area that is off-limits. Using computer vision, this paper gives an approach to classifying pedestrian activity areas. It involves generating enclosing ellipses and rectangles around a pedestrian and tracking the pedestrian to measure how long they stay inside the area. However, it does not calculate complex trajectories.
Another study suggests using spatiotemporal image processing for the task. It focuses on the staying time of the person and, after measuring that, classifies the behavior as loitering or not loitering. There is no active motion tracking in this approach, however.
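Both approaches ultimately compare a staying time against a limit. A minimal sketch of that idea, assuming one tracked position per frame, a rectangular area, and a known frame rate, is shown below. It does not reproduce either paper; the names, the region format, and the time limit are hypothetical.

```python
def longest_stay_seconds(track, region, fps):
    """Longest continuous stay inside `region`, in seconds, given one position per frame."""
    x_min, y_min, x_max, y_max = region
    longest = current = 0
    for x, y in track:
        if x_min <= x <= x_max and y_min <= y <= y_max:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # the person left, so the stay is interrupted
    return longest / fps


def is_loitering(track, region, fps, max_stay_seconds):
    """Flag loitering when the longest stay exceeds the allowed time."""
    return longest_stay_seconds(track, region, fps) > max_stay_seconds


# Hypothetical track: 6 frames inside, 1 frame outside, 3 frames inside again.
region = (0, 0, 10, 10)
track = [(1, 1)] * 6 + [(20, 20)] + [(2, 2)] * 3

print(longest_stay_seconds(track, region, fps=2))                # 3.0
print(is_loitering(track, region, fps=2, max_stay_seconds=2.5))  # True
```

Resetting the counter on exit is a design choice: a stricter system might accumulate the total time instead, so that stepping out briefly does not clear the record.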
7. Leave Behind Event (full view)
Detecting an object or person that has been left behind is another problem solved by computer vision algorithms. This paper suggests an approach that fuses vision and microwave radar information. It uses human face detection and capture, through models that combine results from various networks.
The microwave radar in the algorithm detects moving objects. Together, the results from both show whether an object is left behind and, if so, by whom. There is also this video that shows an ideal left-behind computer vision algorithm in action.
8. Flow Violation
This problem involves tracking a stream of objects and finding whether one or more of them are moving against the collective flow. One of the most prominent scenarios is a vehicle breaking the flow of traffic moving along a road.
This paper gives two approaches to understanding the flow of the traffic and then figuring out whether an object is going against it. A feature-based approach looks for corner features and identifies them as vehicle objects, while the other approach relies on a variety of classifiers trained to identify vehicles. The identified vehicles are tracked with a Kalman filter, a recursive estimator commonly used for motion tracking.
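Once each vehicle has a velocity estimate, deciding who is going against the flow can be sketched as a direction comparison. The example below is an assumption-laden illustration, not the paper's method: it takes per-object velocities as given (for instance from a tracker such as a Kalman filter), uses their mean as the collective flow, and flags objects whose direction has a negative cosine similarity with it. All names and numbers are hypothetical.

```python
import math


def mean_flow(velocities):
    """Average velocity vector, used here as the collective flow direction."""
    n = len(velocities)
    return (sum(vx for vx, _ in velocities) / n,
            sum(vy for _, vy in velocities) / n)


def cosine(u, v):
    """Cosine of the angle between two 2D vectors (0.0 if either has zero length)."""
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0:
        return 0.0
    return (u[0] * v[0] + u[1] * v[1]) / norm


def flow_violators(velocities, max_cosine=0.0):
    """Indices of objects whose direction opposes the collective flow."""
    flow = mean_flow(velocities)
    return [i for i, v in enumerate(velocities) if cosine(v, flow) < max_cosine]


# Hypothetical per-vehicle velocities in pixels per frame; vehicle 3 drives the wrong way.
velocities = [(5.0, 0.2), (4.5, -0.1), (5.2, 0.0), (-4.0, 0.1), (4.8, 0.3)]

print(flow_violators(velocities))  # [3]
```

A plain mean is skewed by the violators themselves, so a more robust estimate such as a median direction may be preferable when many objects break the flow.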
The papers referenced throughout the blog show that computer vision is one of the leading areas in modern computer science research. Every year, researchers achieve remarkable advancements and even introduce new problems for nascent researchers to take up. Now that computer vision is being integrated into consumer products as well, it is only a matter of time before research interest in this field increases even further.