Finding Lanes Without Deep Learning


Today we are going to work together on a project to find lanes in images and videos using Python. For this project, we will take a manual approach. Even though it's true we can get better results using technologies such as Deep Learning, it is also important to learn the concepts and understand how things work at a basic level, so that when we build more advanced models we can apply the knowledge we have already gained. Some of the steps presented here may also be required when using Deep Learning.

The steps we are going to take are the following:

  • Compute the camera calibration and resolve distortions.
  • Apply a perspective transform to rectify the binary image (“bird’s-eye view”).
  • Use color transforms, gradients, etc., to create a thresholded binary image.
  • Detect lane pixels and fit to find the lane boundary.
  • Determine the curvature of the lane and vehicle position with respect to the center.
  • Warp the detected lane boundaries back onto the original image.
  • Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.

All the code and explanations can be found on our GitHub.

Compute the camera calibration

Today’s cheap pinhole cameras introduce a lot of distortion to images. Two major distortions are radial distortion and tangential distortion.

Due to radial distortion, straight lines will appear curved. The effect becomes more pronounced as we move away from the center of the image. For example, in the image shown below, two edges of a chessboard are marked with red lines. You can see that the borders are not straight lines and do not match the red lines: all the expected straight lines are bulged out. Visit Distortion (optics) for more details.

[Image: Camera Distortion Example]

To solve this problem we will use the OpenCV Python library, together with sample images of a chessboard taken with the target camera. Why a chessboard? In a chessboard image we can easily measure the distortion, since we know what the object looks like; we can calculate the distance from the source points to the target points and use it to compute the distortion coefficients, which we can then use to fix the image.

The next image shows an example of an output image from the camera and the undistorted resulting image:

[Image: Fixing Camera Distortion]

All this magic happens in the file lib/camera.py, but how does it work? The process consists of three steps:

  1. Find the chessboard corners: we identify the corners that define the chessboard grid. If we cannot find the board, or the board is incomplete, we discard the sample image.

  2. Calibrate the camera: we find the camera intrinsic and extrinsic parameters from several views of the calibration pattern, which we can then use to produce the resulting image.

  3. Undistort the image: in this final step we produce the resulting image by compensating for lens distortion, based on the parameters found during the calibration step.
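A minimal sketch of these three steps with OpenCV follows; the 9x6 grid size and the file paths are assumptions for illustration, not necessarily the values used in the repository:

```python
import cv2
import glob
import numpy as np

# Assumed 9x6 inner-corner chessboard and a hypothetical folder of samples.
nx, ny = 9, 6
objp = np.zeros((nx * ny, 3), np.float32)
objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)

objpoints, imgpoints = [], []  # 3D points in the world, 2D points in the image
for path in glob.glob('camera_cal/*.jpg'):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    # Step 1: find the chessboard corners; discard the sample if not found.
    found, corners = cv2.findChessboardCorners(gray, (nx, ny), None)
    if not found:
        continue
    objpoints.append(objp)
    imgpoints.append(corners)

# Step 2: compute the intrinsic/extrinsic parameters and distortion coefficients.
_, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)

# Step 3: compensate for lens distortion on any image from the same camera.
undistorted = cv2.undistort(cv2.imread('test.jpg'), mtx, dist, None, mtx)
```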

Apply a perspective transform to rectify the binary image (“bird’s-eye view”).

The next step in the process is to change the perspective of the image, from the regular view of a camera mounted on the front of the car to a top view, also called a “bird’s-eye view”. Here is what it looks like:

[Image: Unwarped Image]

(All this magic happens in the file lib/image_processor.py)

This transformation is very simple: we take four known points on the screen and translate them into desired positions on the screen. Let's review it in more detail using the image above as an example. In the picture we see a green shape that was drawn on top; this rectangle uses the four source points as its corners, and it overlaps what would be, from the camera's point of view, a regular straight road. The rectangle cuts off around the center of the image, which, because of the perspective, is where the street view would normally end to give way to the sky. Now we take those points and move them to our desired positions on the screen, transforming the green area into a rectangle going from 0 to the height of the picture. Here are the source and destination points we will use in our code:
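The exact coordinates depend on the camera and the image resolution; the values below are illustrative assumptions for a 1280x720 image, not the ones from the repository:

```python
import numpy as np

h, w = 720, 1280  # assumed image size
# Corners of the green shape on the camera image (assumed values).
src = np.float32([[580, 460], [700, 460], [1040, 680], [240, 680]])
# Where those corners should land on the bird's-eye image.
dst = np.float32([[260, 0], [1020, 0], [1020, h], [260, h]])
```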

Once the points are identified it’s as simple as using OpenCV to do its magic once more:
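A sketch of the warp itself, reusing the src and dst points from above; cv2.getPerspectiveTransform and cv2.warpPerspective are the actual OpenCV calls for this:

```python
import cv2

# Transform matrix and its inverse (the inverse lets us map results back later).
M = cv2.getPerspectiveTransform(src, dst)
Minv = cv2.getPerspectiveTransform(dst, src)

# Warp the camera image into the bird's-eye view.
warped = cv2.warpPerspective(img, M, (w, h), flags=cv2.INTER_LINEAR)
```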

Use color transforms, gradients, etc., to create a thresholded binary image.

Now that we have the image in place, we need to start discarding all the irrelevant information from it and keep only the lines. For this we will apply a series of changes which we will detail next:

Convert the color image to greyscale

Do some minor but important enhancements by smoothing the image with a Gaussian blur and blending the original image with the smoothed one
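A minimal sketch of these first two steps; the kernel size and blend weights are assumptions (an unsharp-mask-style blend):

```python
import cv2

gray = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
# Smooth the image, then blend the original with the smoothed version.
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
enhanced = cv2.addWeighted(gray, 1.5, blurred, -0.5, 0)
```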

Calculate the derivative of the color change function on the X-axis and apply a threshold to filter high-intensity color changes, which, as we are working in greyscale, correspond to borders.
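This can be sketched with a Sobel operator on the x-axis; the threshold values are assumptions:

```python
import cv2
import numpy as np

# Derivative in x highlights the near-vertical edges that lane lines produce.
sobelx = cv2.Sobel(enhanced, cv2.CV_64F, 1, 0, ksize=3)
scaled = np.uint8(255 * np.absolute(sobelx) / np.max(np.absolute(sobelx)))

# Keep only strong color changes (borders); the range is illustrative.
gradx_binary = np.zeros_like(scaled)
gradx_binary[(scaled >= 20) & (scaled <= 100)] = 1
```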

Now we calculate the direction of the gradient and apply new thresholds to it

Next, we combine them into one gradient
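A sketch of these two steps; the angle range is an assumption, and gradx_binary comes from the previous snippet:

```python
import cv2
import numpy as np

# Direction of the gradient: lane lines are roughly vertical on screen.
sobely = cv2.Sobel(enhanced, cv2.CV_64F, 0, 1, ksize=3)
direction = np.arctan2(np.absolute(sobely), np.absolute(sobelx))

dir_binary = np.zeros_like(gradx_binary)
dir_binary[(direction >= 0.7) & (direction <= 1.3)] = 1

# Combine the x-gradient and direction masks into a single gradient mask.
gradient_binary = np.zeros_like(gradx_binary)
gradient_binary[(gradx_binary == 1) & (dir_binary == 1)] = 1
```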

Next we apply a color filter to the original image, trying to keep only those pixels that are yellowish or white (as road lines are)
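One way to sketch this filter is with color ranges; the exact ranges below are assumptions:

```python
import cv2
import numpy as np

# White-ish pixels straight from the BGR image, yellow-ish ones via HSV.
hsv = cv2.cvtColor(warped, cv2.COLOR_BGR2HSV)
white_mask = cv2.inRange(warped, (200, 200, 200), (255, 255, 255))
yellow_mask = cv2.inRange(hsv, (15, 80, 120), (35, 255, 255))

color_binary = np.zeros(warped.shape[:2], dtype=np.uint8)
color_binary[(white_mask > 0) | (yellow_mask > 0)] = 1
```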

HSL threshold on L layer and S layer

For this task it's necessary to change color spaces; in particular, we will use the HSL color space, as it has interesting characteristics for the images we are working with.
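A sketch of the HSL thresholds (OpenCV names this color space HLS; the threshold values are assumptions):

```python
import cv2
import numpy as np

hls = cv2.cvtColor(warped, cv2.COLOR_BGR2HLS)
l_channel = hls[:, :, 1]
s_channel = hls[:, :, 2]

# Saturation catches colored lane paint; lightness catches bright markings.
hls_binary = np.zeros_like(s_channel)
hls_binary[((s_channel >= 170) & (s_channel <= 255)) |
           ((l_channel >= 200) & (l_channel <= 255))] = 1
```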

And finally, we combine all of it into a final image:
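Combining the masks from the snippets above into one binary image:

```python
import numpy as np

combined = np.zeros_like(gradient_binary)
combined[(gradient_binary == 1) | (color_binary == 1) | (hls_binary == 1)] = 1
```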

Our new image now looks something like this:

[Image: Resulting thresholded binary image]

Awesome! Can you already see the lines forming there?

Detect lane pixels and fit to find the lane boundary.

Up until now, we have been able to create a bird's-eye-view image that contains only the lane features (at least for the most part; we still have some noise). With this new image we can start doing some calculations to turn the image into actual values we can use, like lane position and curvature.

Let's work on identifying the pixels in the image first and building a polynomial that represents the lane function. How are we planning on doing so? It turns out there is a very clever method using a histogram of the bottom half of the image. Here is an example of what the histogram looks like:

[Image: Histogram]

The peaks in the histogram help us identify the left and right sides of the lane. Here is how building the histogram looks in code:
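A minimal sketch, assuming combined is the binary bird's-eye image built earlier:

```python
import numpy as np

# Sum pixel values column by column over the bottom half of the binary image.
histogram = np.sum(combined[combined.shape[0] // 2:, :], axis=0)

# The peak on each half marks the base of the left and right lane lines.
midpoint = histogram.shape[0] // 2
leftx_base = np.argmax(histogram[:midpoint])
rightx_base = np.argmax(histogram[midpoint:]) + midpoint
```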

But you may ask, why only the bottom half? Well, the answer is that we want to focus on the segments immediately in front of the car, as the lane may curve further ahead, which would affect our histogram. Once we find the position of the lane lines closest to the car, we can use a moving-window approach to find the rest, as we detail in the next picture:

[Image: Moving Window Processing Example]

Here is what it looks like in code:
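A condensed sketch of the sliding-window search; the window count, margin, and minpix values are assumed tuning values, and combined, leftx_base, and rightx_base come from the histogram step:

```python
import numpy as np

# Assumed tuning values: window count, search margin, min pixels to recenter.
nwindows, margin, minpix = 9, 100, 50
window_height = combined.shape[0] // nwindows
nonzeroy, nonzerox = combined.nonzero()

leftx_current, rightx_current = leftx_base, rightx_base
left_inds, right_inds = [], []

for window in range(nwindows):
    # Vertical bounds of this window, counting from the bottom of the image.
    y_low = combined.shape[0] - (window + 1) * window_height
    y_high = combined.shape[0] - window * window_height

    # Collect the nonzero pixels that fall inside each window.
    good_left = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                 (nonzerox >= leftx_current - margin) &
                 (nonzerox < leftx_current + margin)).nonzero()[0]
    good_right = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                  (nonzerox >= rightx_current - margin) &
                  (nonzerox < rightx_current + margin)).nonzero()[0]
    left_inds.append(good_left)
    right_inds.append(good_right)

    # Re-center the next window on the mean x position of the pixels found.
    if len(good_left) > minpix:
        leftx_current = int(np.mean(nonzerox[good_left]))
    if len(good_right) > minpix:
        rightx_current = int(np.mean(nonzerox[good_right]))

left_inds = np.concatenate(left_inds)
right_inds = np.concatenate(right_inds)
leftx, lefty = nonzerox[left_inds], nonzeroy[left_inds]
rightx, righty = nonzerox[right_inds], nonzeroy[right_inds]
```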

This process is very intensive, so when processing video there are a few things we can adjust, as we do not always need to start from zero: the calculations made for a previous frame give us a window of where the lanes are likely to be next, making them easier to find. All of that is implemented in the final code in the repository; feel free to take a look.

Once we have all the windows, we can build the polynomial using all the identified points; each line (left and right) is calculated independently, as follows:
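A sketch of the fit, using the pixel coordinates collected by the sliding windows; note that we fit x as a function of y, since the lines are near-vertical on screen:

```python
import numpy as np

left_fit = np.polyfit(lefty, leftx, 2)    # coefficients A, B, C of x = Ay^2 + By + C
right_fit = np.polyfit(righty, rightx, 2)
```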

The number 2 represents a second-order polynomial.

Determine the curvature of the lane and vehicle position with respect to the center.

Now that we know where the lines are on the image, and we know the position of the car (the center of the camera), we can do some interesting calculations to determine the curvature of the lane and the position of the car with respect to the center of the lane.

The curvature of the lane is a simple calculation over the polynomial.

There is an important consideration, though: for this step we can't work in pixels; we need a way to convert pixels to meters. So we introduce two variables, _ym_per_pix and _xm_per_pix, which are pre-defined values. We won't go into much detail about them; you can take these values as presented. If you want to learn more, there are procedures for identifying these values using algorithms and camera information.
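A sketch of the curvature calculation. For a second-order fit x = Ay^2 + By + C, the radius of curvature at a point y is (1 + (2Ay + B)^2)^(3/2) / |2A|. The meters-per-pixel values below are commonly used assumptions for this kind of project, not measured ones:

```python
import numpy as np

# Assumed conversions: meters per pixel along each axis.
_ym_per_pix = 30 / 720
_xm_per_pix = 3.7 / 700

# Refit the polynomial in world space (meters) instead of pixels.
left_fit_m = np.polyfit(lefty * _ym_per_pix, leftx * _xm_per_pix, 2)
A, B = left_fit_m[0], left_fit_m[1]

# Evaluate at the bottom of the image, i.e. closest to the car.
y_eval = combined.shape[0] * _ym_per_pix
left_curvature = (1 + (2 * A * y_eval + B) ** 2) ** 1.5 / abs(2 * A)
```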

The vehicle position is very simple: calculate the position of the middle of the lane and compare it to the center of the image, like this:
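A minimal sketch, assuming the camera is mounted at the center of the car and reusing the pixel-space fits from earlier:

```python
import numpy as np

# x position of each line at the bottom of the image, in pixels.
y_bottom = combined.shape[0] - 1
left_x = np.polyval(left_fit, y_bottom)
right_x = np.polyval(right_fit, y_bottom)

# Offset of the image center from the lane center, converted to meters.
lane_center = (left_x + right_x) / 2
offset_m = (combined.shape[1] / 2 - lane_center) * _xm_per_pix
```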

All Done!

Now you have all the information you need, and the polynomial to represent the lane. Your final result should look like the following:

[Image: Final result with the detected lane rendered on the road]

And a sample video:

[Video: Sample result]

Or maybe not exactly… Remember we warped the image to the bird's-eye view? Well, you need to revert that effect to render the polynomial onto the original image, but I'll leave that as homework for you, or you can just check it out in my code.

Remember, as mentioned, all the code is available on GitHub.

Thanks!

Originally published at https://livecodestream.dev.
