Making Rockets Hover with PID Controllers
I spend way too much time on YouTube… probably much more than I’d like to admit. For all of its shortcomings, YouTube has a lot of very interesting content from dedicated and passionate creators. Every so often I’ll run across a video that sparks an idea for something I want to try. One such video I recently came across was this SpaceX propulsion test of their Dragon 2 spacecraft.
What stands out to me in particular about this video is that the spacecraft uses its boosters to perfectly hover a few feet off the ground. At first glance, this might not seem like a big deal. However, to achieve this, the craft must fire its thrusters hard enough to accelerate towards the target while slowly reducing the thrust force until it exactly counteracts gravity and the rocket hovers in midair.
The rocket equation has been known for quite some time, so it’s possible to calculate the exact thrust schedule needed to make the rocket hover. However, it’s much easier (and more fun!) to let the computer do all the heavy lifting for us. Instead of doing the math by hand, we’ll use Control Theory to make the rocket hover.
To get a feel for things, we’ll create our own basic rocket simulation and implement a couple of controllers. Unlike my last post, this one requires a lot more code, so I’ll only give a brief sketch of the most important parts of the implementation. If you want something that actually works, or you’re curious what the final product looks like, you can find everything on my GitHub. As always, pull requests welcome :).
Playing God — Simulating Physics
Before we can control the rocket we need to first build the rocket. To do this, we’ll implement a very idealized physics simulation. We’ll treat our rocket as a particle with only the forces of gravity and thrust acting on it. Since there will be no lateral forces we can conveniently skip calculating any angular dynamics. We’ll also assume that the rocket has infinite fuel so we don’t have to deal with things like changing mass or fuel slosh.
To compute the instantaneous rocket dynamics, we first discretize time into intervals of size dt and take into account any forces acting on the rocket at the end of each interval. Once we’ve resolved the forces, we can update the acceleration, velocity, and position of the rocket.
There are two things to note here:
- Our rocket lives in a flat world so we’ll use two-element numpy arrays as 2D vectors for our simulation.
- Annoyingly, screen coordinate systems set the origin (0, 0) to be the top left of the screen, so increasing y moves things down not up. This means that in our world gravity is 9.8 m/s² and not -9.8 m/s².
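Putting the update step and the two notes above together, a minimal sketch of the rocket might look like the following. This is not necessarily what lives in the repo: the class layout, the `mass` and `max_thrust` values, and the `thrust` throttle attribute are all assumptions picked for illustration.

```python
import numpy as np

# Screen coordinates: +y points down, so gravity is positive.
GRAVITY = np.array([0.0, 9.8])


class Rocket:
    """Point-mass rocket. Names and constants are illustrative assumptions."""

    def __init__(self, position, mass=1.0, max_thrust=19.6):
        self.position = np.asarray(position, dtype=float)
        self.velocity = np.zeros(2)
        self.mass = mass              # constant: infinite fuel, no mass change
        self.max_thrust = max_thrust  # newtons; assumed to be twice the weight
        self.thrust = 0.0             # throttle fraction in [0, 1]

    def update(self, dt):
        # Thrust pushes the rocket up, which is the -y direction on screen.
        thrust_force = np.array([0.0, -self.thrust * self.max_thrust])
        acceleration = GRAVITY + thrust_force / self.mass
        # Integrate acceleration -> velocity -> position over one time step.
        self.velocity = self.velocity + acceleration * dt
        self.position = self.position + self.velocity * dt
```

Note that the velocity is updated first and the new velocity is then used to move the rocket. That ordering is a choice made in this sketch, and it affects the integration error in a small but visible way.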
Now that we have a rocket, the only reasonable thing to do is set the thrust to 100% and let that baby rip.
And just like that we’re off to the moon. It’s interesting to note that even though this is a very simple simulation with crude programmer graphics, the rocket taking off actually looks kind of realistic. Ok, maybe not that realistic.
If you don’t care that our simple physics system is slightly wrong, or simply hated Calc I in college, feel free to skip this section. However if, like me, Numerical Quadrature really does it for you, then read on.
When you think about it, what we’re really doing in Rocket.update is integrating the acceleration and velocity over time. Actually, we’re using the Euler Method to approximate the integrals with rectangles of width dt. Since the acceleration is constant, our velocity computation will be exact. However, because of this constant acceleration, the velocity increases linearly over time, so the position approximation will be incorrect. If we think about the curves traced out in the Cartesian plane, the acceleration curve forms a rectangle with the x-axis, while the velocity curve forms a triangle. Intuitively, we should be able to exactly compute the rectangle’s area by smaller rectangles, but no matter how hard we try we cannot exactly compute the triangle’s area and will end up overestimating it.
To make this explicit, let’s see what the error is between our approximated position and the actual position, which we can figure out using the kinematic equations. Assume that the rocket has a constant acceleration of 5m/s² and we use dt=.5. According to our approximation, after 5 seconds the rocket will be displaced by 68.75m, while the kinematic equations give 62.5m, so our approximation error is +6.25m.
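The numbers above are easy to reproduce. Here is a small sketch that assumes the rocket starts at rest and that the velocity is updated before the position on every step (which is what produces an overestimate); the function names are made up for this example.

```python
def euler_displacement(acceleration, dt, duration):
    """Displacement from rest using the velocity-then-position update."""
    # Use an integer step count so float drift in a running time
    # variable cannot sneak in an extra step.
    steps = round(duration / dt)
    velocity = 0.0
    position = 0.0
    for _ in range(steps):
        velocity += acceleration * dt
        position += velocity * dt
    return position


def exact_displacement(acceleration, duration):
    """Kinematic equation for constant acceleration from rest: x = a * t^2 / 2."""
    return 0.5 * acceleration * duration ** 2


approx = euler_displacement(5.0, 0.5, 5.0)
exact = exact_displacement(5.0, 5.0)
print(approx, exact, approx - exact)  # 68.75 62.5 6.25
```

Working the sum out by hand under the same assumptions gives an error of a·t·dt/2, so the error shrinks linearly as dt gets smaller.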
It’s possible to get better integral approximations by using more sophisticated methods like Runge-Kutta. That being said, dt=.5 is unreasonably large. If instead we use something like dt=1/60 (which is what I’m using) we get an error of only ~.21m. So while our simulation is wrong, it’s not that wrong. Of course, as dt → 0, our approximation will get closer and closer to the actual value.
Getting a rocket to shoot off into space is cool and all, but wasn’t the point to make it hover? For that we’ll need some way to control the rocket, and what better way to do that than with Control Theory? Control Theory is a deep and interesting field which I have neither the time nor the expertise to explain. For now, we’ll use just enough to do some cool things with it.
If we sat down and tried to come up with a quick solution, the immediate thing that comes to mind is to just hit the gas until we reach our target then slam on the brakes once we’re there. This is actually one of the simplest controllers possible, called an On-Off controller (or Bang-Bang controller, which in my opinion sounds 100x cooler). Implementing it is extremely simple.
In Control Theory terms, what we’re trying to do is move a process towards a desired setpoint. We can do this by minimizing the error (i.e. difference) between the setpoint and the measured state of the process, known as the process variable. To put things into our own terms, the setpoint is the height at which we want the rocket to hover and the process variable is simply the rocket’s vertical position.
Since the On-Off controller is binary, it returns either ∞ if the error is positive or -∞ if it’s negative. We can use this signal to set the rocket’s thrust, but we first have to squash it through a sigmoid (💁♂️🦋 “is this ML?”) and turn it into a percentage which the rocket can use. We finally have all the components needed to run our simulation end-to-end.
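To make the pieces concrete, here is a rough, self-contained sketch of an On-Off controller, the sigmoid squashing, and a stripped-down 1-D simulation loop. Several things here are assumptions rather than the real implementation: the `update(process_variable, dt)` signature, the `thrust_fraction` and `simulate` helpers, a maximum thrust of twice gravity, and in particular the sign flip inside `thrust_fraction`. That flip is there because, in screen coordinates, a rocket below the target has a negative error and needs more thrust, not less.

```python
import math


class OnOffController:
    """Bang-bang controller: only the sign of the error matters."""

    def __init__(self, setpoint):
        self.setpoint = setpoint

    def update(self, process_variable, dt):
        error = self.setpoint - process_variable
        if error > 0:
            return math.inf
        if error < 0:
            return -math.inf
        return 0.0


def sigmoid(x):
    # Equivalent to 1 / (1 + exp(-x)), written with tanh so that
    # infinite or very large inputs do not overflow.
    return 0.5 * (1.0 + math.tanh(x / 2.0))


def thrust_fraction(signal):
    # Assumed sign convention: a negative signal (rocket below the target
    # in screen coordinates) maps to full thrust.
    return sigmoid(-signal)


def simulate(controller, start_y, seconds, dt=1 / 60,
             gravity=9.8, max_thrust_accel=19.6):
    """1-D vertical simulation in screen coordinates (+y is down)."""
    y, velocity, ys = start_y, 0.0, []
    for _ in range(round(seconds / dt)):
        throttle = thrust_fraction(controller.update(y, dt))
        acceleration = gravity - throttle * max_thrust_accel
        velocity += acceleration * dt
        y += velocity * dt
        ys.append(y)
    return ys


# Start 50 units below a target at y = 50 and record the trajectory.
ys = simulate(OnOffController(setpoint=50.0), start_y=100.0, seconds=10.0)
```

The controller never looks at `dt`; the parameter is only there so that a smarter controller with the same `update` signature can be dropped into `simulate` unchanged.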
Now that we’ve imbued our rocket with some decision making abilities, let’s see how well it can hover.
Unfortunately, not too well. This is because we’re not taking into account the rocket’s inertia. When we cut off the thrust at the target point, the rocket will still have a high upwards velocity which causes it to keep going up for a while. Similarly, when the rocket’s coming down, its velocity due to gravity will be too high for the thrust force to immediately counteract.
What we need to do instead is to slowly decrease our acceleration to 0 as we get closer to the target. To do this, we have to thrust with just enough force to perfectly counterbalance gravity. Since the On-Off controller is binary, we can’t possibly do this, and we must instead use something more powerful. Enter the PID controller.
The PID controller is more sophisticated as it keeps track of the process’ state over time and tries to account for the inertia issues discussed. It does this by adjusting the response via proportional, integral, and derivative control. Let’s take a look at how each component plays a role in our case:
- Proportional control scales the rocket’s thrust based on the magnitude of the error. This makes sense intuitively: if the error is large and negative (i.e. the rocket is way below the target) we want to hit the thrust as hard as possible. On the other hand, if we’re close to the target we want to apply just enough thrust to get to the target but not overshoot it.
- Integral control is probably the least intuitive component to think about. It adjusts the rocket’s thrust based on the accumulated error over time which has the effect of driving the steady state error towards zero. For example, if the error keeps increasing over time, more thrust will be applied to counteract this. In my experiments, the integral component mattered the least in controlling the rocket hovering, so if you don’t understand exactly what it is, don’t worry, neither do I.
- Lastly, derivative control accounts for the error’s rate of change over time. This is really the term that makes the whole system work as it’s responsible for annealing the acceleration towards zero as we approach the target. Like the proportional control, this also has a very intuitive interpretation: if the error is changing rapidly, the thrust must also change rapidly.
It takes some thinking to fully wrap your head around the 3 different PID components. Fortunately, the implementation is pretty straightforward and is easily translated into code.
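In the textbook form, the controller output is kp·e + ki·∫e dt + kd·de/dt, where e is the error. A minimal discrete-time sketch of that formula is below. The class and attribute names are my own choices for this example, and skipping the derivative on the very first call (to avoid a huge spike from a made-up previous error) is a design decision of this sketch, not something the formula requires.

```python
class PIDController:
    """Discrete PID: proportional + integral + derivative of the error."""

    def __init__(self, setpoint, kp, ki, kd):
        self.setpoint = setpoint
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.previous_error = None  # no derivative estimate on the first call

    def update(self, process_variable, dt):
        error = self.setpoint - process_variable
        # Integral term: running sum of error * dt (rectangle rule).
        self.integral += error * dt
        # Derivative term: finite difference of the last two errors.
        if self.previous_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

The returned signal is unbounded, so it still has to be squashed into a thrust percentage before the rocket can use it.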
The kp, ki, and kd parameters are weighting constants applied to each of the 3 components. Because the proportional, integral, and derivative terms interact in unintuitive ways, setting them appropriately is heavily task dependent and takes a bit of manual tuning. There are even methods which do this for you automatically.
When I was first implementing everything, I tried a bunch of different settings for the weights but nothing seemed to be working. The rocket would either wildly oscillate around the target, like the On-Off controller, or just shoot off into space, never returning again. After a while, I found that the integral constant was ridiculously high. A good rule of thumb, at least for our rocket situation, is to set ki to be about 3–4 orders of magnitude smaller than kp and kd. In fact, starting out with ki=0 and kp=kd=1 will almost certainly give you good results. You can then play around with the parameters until the rocket trajectory looks good. After some tweaking I found that kp=1.0, ki=.0001, kd=2.3 work pretty well.
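If you want to try different gains without the rendering code, a toy harness like the one below is enough. Treat it as a sketch under some loud assumptions: the rocket is a 1-D point in screen coordinates, full thrust is exactly twice gravity (so 50% throttle hovers), the throttle is `sigmoid(-signal)`, and the `make_pid` and `hover` helpers are invented for this example. Gains that behave well here are not guaranteed to behave the same way in the full simulation.

```python
import math


def make_pid(setpoint, kp, ki, kd):
    """Return a stateful PID update function (compact closure form)."""
    state = {"integral": 0.0, "previous_error": None}

    def update(process_variable, dt):
        error = setpoint - process_variable
        state["integral"] += error * dt
        previous = state["previous_error"]
        derivative = 0.0 if previous is None else (error - previous) / dt
        state["previous_error"] = error
        return kp * error + ki * state["integral"] + kd * derivative

    return update


def hover(kp, ki, kd, target_y=50.0, start_y=100.0, seconds=30.0, dt=1 / 60):
    """Run the toy rocket (+y is down) and return its y trajectory."""
    gravity, max_thrust_accel = 9.8, 19.6  # assumed: full thrust is 2x gravity
    controller = make_pid(target_y, kp, ki, kd)
    y, velocity, ys = start_y, 0.0, []
    for _ in range(round(seconds / dt)):
        signal = controller(y, dt)
        # sigmoid(-signal), written with tanh to avoid overflow.
        throttle = 0.5 * (1.0 + math.tanh(-signal / 2.0))
        velocity += (gravity - throttle * max_thrust_accel) * dt
        y += velocity * dt
        ys.append(y)
    return ys


# Compare the suggested starting point with the tuned gains from the text.
for gains in [(1.0, 0.0, 1.0), (1.0, 0.0001, 2.3)]:
    ys = hover(*gains)
    final_error = abs(ys[-1] - 50.0)
    overshoot = max(0.0, 50.0 - min(ys))  # how far above the target it went
    print(gains, "final error:", final_error, "overshoot:", overshoot)
```

Printing the final error and the overshoot for each set of gains is a quick way to compare them before bothering to look at an animation.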
That’s a pretty smooth hoverin’ rocket.
I first heard about PID control in the context of some Reinforcement Learning experiments I was running at work. With the recent extreme hype of Machine Learning and Deep Learning, these “classic” methods are easy to overlook, but as we saw, we can get extremely good results with very little effort. It seems the thousands of hours people invested into researching and perfecting these methods have not gone to waste.
That being said, a great next step would be to see if a Reinforcement Learning algorithm could learn to control the rocket to hover at the target. While I’m certain this is doable, I wonder how long you’d need to train for and how good the final results would be.
Thanks for reading! Check out the GitHub for the full code.
May 7th, 2019
 For what it’s worth, the spacecraft seems to be suspended by cables in the exact target position, so all the craft must do is apply enough force to counteract gravity — a much simpler problem.
 The biggest chunk of code that I’m skipping over is the rendering code, since it’s pretty boring. Funny story, while I was writing all the code for this blog post, I ran into issues with PyGame on Mac. After spending like 3–4 hours trying out different rendering frameworks (tkinter, turtle, kivy, etc), I settled on graphics.py. It’s extremely simple, but turned out to have everything I needed.
 It’s possible to get even better results with the On-Off controller. For example, we can have a dead-band around the target so that the thrusters are turned off when the rocket is close to the target, or rapidly switched on/off to fake applying a certain thrust force. However, if we were to implement all of that, we’d find that what we’ve created is a rough approximation to the PID controller anyways.
It should also be possible to learn these constants using ordinary least squares regression. It would be interesting to see how much easier or harder this is compared to the traditional methods and how the learned coefficients would differ.