In this article i will use two real world examples in order to illustrate how the proper selection of time steps will influence the result of a numerical simulation. I got the idea for this article after finishing the magnetic pendulum simulation. The problem solved there is chaotic and i always wanted to know how reliable the result is and how different it would have been looked like had i choosen to use another integration scheme. Moreover i wanted to know if it is possible to find points in such a simulation where from a set of different integrators every single one of them converges to a different magnet. This article will answer both questions and in addition it will shed some light on the importance of choosing the right integration scheme and time step size with focus on the balance between accuracy and calculation speed.
In order to understand this article I recommend basic knowledge of numerical integration schemes. Having heard about Euler, RungeKutta and AdamsBashforth integration schemes would certainly help since i will not explain them in detail here. If you do not know about RungeKutta I recommend the following article:
The accuracy of an integration scheme is described by the local and the global error. The local error is the error made at a single integration step. The global error is the sum of local errors for many integration steps. Usually the big O notation is used to describe the local and global error of an integration scheme. The maximum global error of an integration scheme of order p (for small step sizes) is a function proportional to h^p or in big O notation: O(h^p). The general rule is: The higher the order p, the better is the integration scheme.
The ultimate goal of every simulation is getting accurate results in a reasonable amount of time. You have the choice of the integration scheme and once you select it you have to use it with a suitable timestep. The general Problem is that an accurate high order scheme will require more evaluations making it slower but on the other side it will work with larger step sizes making it faster. So your choice is either using an accurate scheme with large time steps or an not so accurate scheme with smaller time steps. But be carefull: One of the misconceptions is that if you choose your time step small enough even a bad integration scheme like Euler will produce the right result. Thats wrong! To find out why lets have a closer look at the global error of a numerical scheme. The following properties of the integration scheme are directly linked to the time step size:
Image 1 now shows the total error as the sum of roundoff error and method error plotted on a logarithmic scale. As you can see there is a step width for which the global error has a minimum. Every numerical scheme has an optimal time step size for which the total error is minimized. The consequence is that if you select your time step size too small your result will get worse not better! So much for the theory but can we actually observe this in a example? Lets find that out...
The first thing you should do with your integration engine is checking it against a known mathematical model for verification. In order to do this you need a problem with a problem that has a known analytic solution. I choose the damped harmonic oscillator as the verification model. The behaviour of such a system can be described by the following homogeneous 2nd order differential equation:
With:
m  Mass of the pendulum r  Coefficient of friction k  Spring constant. We assume the following initial conditions: The result of this differential equation is a function describing the position of the oscilator as a function of time: with: 

The difference between analytic solution (see Image 3) and simulated solution is the total error. The relation between the time step size and the total error for different integration methods is shown in image 4.
Image 3: Comparison between the analytic and simulated solution for a time step size of h=0.005. All methods except Euler's method match the solution pretty well. 
Image 4: Integration step size plotted against the square sum of total error for different integration schemes. 
Obviously the theoretical relationship between the total error and the time step size is confirmed. You can clearly see there is an optimal time step size for which the total error is minimal. You can also see that the total error is indeed smaller for the high order schemes with RK5 clearly showing the lowest error of all integration methods. Something you will also notice is that the higher the order of the integration scheme the larger is the step size that produces the best result. If you look at the Euler scheme something else is noteable: Despite being able to use step sizes smaller than in any other method the total error is still significantly larger in comparison to any other scheme, even if they use time step sizes of a order of magnitude larger. And if you look at Image 3 you will see that Euler just doesn't get it right even with a moderate step size. So what do we learn from that? How about that:
Lets have a closer look at the timing of the simulation cause after all what one really wants is getting an accurate result without spending too much time. So lets look at image 5 which illustrates the dependency between time step size and calculation time. The basic (and obvious) rule for all integration schemes is: The larger the timestep the faster the calculation. If you look closely at image 5 you will see a perfect set of straight lines, except for well they arn't really straight because i had to apply logarithmic scaling to both axis in order to get a meaningful diagram. So it's no linear relation between time step size and calculation time its a power law. If you cut the time step size in half you double the number of points in the calculation. The bottom line is that you really pay for small timesteps regardless which integration method you use! So you'd better choose wise...
So let's choose wisely and analyse yet another diagram (image 6). This time we see the calculation time vs. the global error squared. This is the point were it's getting interesting because now the relation between accuracy and the calculation time becomes obvious. So how long does any of the schemes need to produce a result with a given accuracy? The answer is right here in image 6:
Image 5: Integration step size vs. calculation time. Unsuprisingly the calculation time decreases over the the stepsize. 
Image 6: Calculation time vs. sum of error squares (global error). 
So what do we make of it? A table! At least thats what i make of it. Lets check how long it takes to solve our differential equation and lets say we do not want the square sum of the error to exceed 1e5. Which scheme does it quickest?
Rank  Integration Scheme  Calculation time in ms  Time step width 

1  RK 5  0.01  0.02 
2  RK 4  0.015  0.01 
3  ADB 5 (* see below)  0.03  0.005 
4  ADB 4  0.039  0.0035 
5  ADB 3  0.09  0.0015 
6  Euler mod. Heun  0.5  0.0003 
7  RK3 ADB2  0.7  0.0003 
9  Euler  36812  4e9 
The Winner is the Runge Kutte 5th order scheme! In this test it performs better than any other scheme because it allows using larger timesteps without loosing too much accuracy. A strong second goes to the RK4 scheme which is still better than the 5th order AdamsBashforth method. ADB 5 is a bit of a disappointment here cause as a multistep method it had all to win this race. The Calculations needed for each timesteps are minimal and yet it came in at the third place. The reason for this result is very simple: The calculations needed for calculating the forces for a damped harmonic oszillator are too simple. Multistep schemes can't play out their advantage over the single Step schemes if there is almost nothing to calculate. This results would change when dealing with more complex calculations such as 3body problems or an nbody problem. So don't write multistep schemes off based on these results! Also disappointing is the performance of the RK3 scheme which fell behind the modified Euler and Heun methods and came in tied with ADB2. The reason is simple: It needs more calculations but fails to allow much larger timesteps.
The next test will deal with a magnetic pendulum simulation. The basic principles of the magnetic pendulum are explained in Video 1. The model used in this test is shown in Image 5. The simluation was solved with different integration schemes for a lot of different time step sizes. The results will then be compared and analysed with regard to time step sizes and accuracy.
Video 1: Explanation of the magnetic pendulum ; © 2007 Aldo Cavini Benedetti 
Image 7: The second test used a Magnetic pendulum model with 10 Magnets. 
But before we do so lets have a look at two sample points. The movement of the pendulum is chaotic but chaotic behavior doesnt mean you can't calculate it, it just means that the calculation error increases exponentially with time. So you can't calculate very much into the future because eventually even small variations in the initial conditions will grow and produce results that are totally different. The next test will investigate traces produced by different integration schemes. For this test the initial conditions of all integration schemes were identic so any error in the traces is the direct result of inaccuracies of the integration scheme. As you can see in image 8 There are some points in the simulation domain with virtually no difference between the edifferent results whilst there are other points were virtually every scheme produces a different result (Image 9).
Image 8: There are starting points for which all integration schemes provide almost identic results. 
Image 9: Other starting points yield different results for almost all integration schemes. 
With this background it's interesting to see how a complete calculation will look like. First lets start with the resulting image. The following images were calculated with a time step size of h=0.015 s.
Large parts of the simulation domain show a chaotic behaviour, the central part and some structures appear to be reproduceable in all results. But in principle the good news is that all solutions look remarkably similar! So maybe we should have a closer look how similar they are:
Image 11: Greyscale image in which the value of a pixel indicates how many different results were obtained. White means all 10 integration schemes produced the same result, black means all produced different results. Pixelcolors were taken from the results of the RK5 scheme. 
Image 12: The resulting image with pixels only shown if 9 out of 10 integration schemes produced the same result. 