"""
Author: Jonathan Schwartz [jonbs98@gmail.com]
Group: GWU's Adaptive Devices and Microsystems Lab
Professor: Doctor Gina Adam
Description:
Robobee Simulator with a PD controller for stabilizing the robot and an LQR
controller that directs the robot toward points in 3D space (while also keeping it
adequately stable). The LQR routine also uses the readSensors() and getAngularVel()
functions, which simulate the robot reading data off of its 4 phototransistors and
pass that data back to the robot. The robot uses this data to estimate its
state, which is passed on to the control algorithm so the robot can make an
informed decision as to its next move.
This work was inspired by the work of Taylor S. Clawson and Doctors
Silvia Ferrari and Robert Wood. More detail can be found here: https://ieeexplore.ieee.org/document/7798778
"""
import numpy as np
import matplotlib.pyplot as plt
import control
from random import seed, random
class roboBee(object):
    """ CONSTANTS & ROBOT SPECS """
    B_w = 0.0002     #drag constant [Ns/m]
    Rw = 0.009       #distance from robot's Center of Mass to where its wings exert torques on its body [m]
    Jz = 0.45e-9     #Z Axis Rotational Moment of Inertia [kg*m^2]
    MASS = 0.08      #mass [g]
    g = 9.81         #gravity [m/s^2]
    dt = 1/120       #time step in seconds; represents one step at 120 Hz
    LIFT_COEFFICIENT = 1.0
    last_sensor_readings = np.array([0.0, 0.0, 0.0, 0.0]).reshape(4,1)
    def updateState_PD_Control(self, state, dt):
        """
        This function will calculate the next state using the current state
        and the plant's physics (the plant is the transfer function from inputs to the
        Robobee system to its outputs). The PD controller simply calculates a torque
        for the robot to generate during the next dt to drive the robot towards a
        state where theta and theta_dot are both 0 (i.e. the robot is hovering in
        space pointing straight upwards with no rotational velocity). It calculates
        this torque by multiplying a pre-specified constant by the robot's current
        angular position and another pre-specified constant by the robot's current
        angular velocity. These two terms are added together, and their sum is the
        torque the robot will generate during the next time step.
        This is the state-space form (where x_dot is the derivative of x, which is the
        state vector, u is the input vector, and A and B are coefficient matrices):
            x_dot = A*x + B*u
        ==== ARGUMENTS ====
        state = current state (4 double numpy 1D array)
            state[0] = theta (angle of rotation from inertial to global coords in 2D)
            state[1] = theta_dot (change in theta per unit time)
            state[2] = x axis position in global coordinates
            state[3] = x_dot (x axis velocity)
        dt = time step = 1/120 [seconds] (wings flap at 120 Hz)
        ==== RETURNS ====
        new_state = The state of the robot one time-step (after dt [seconds]) in the future
        torque_applied = torque the robot decides to generate with its wings based
                         on its current state; this is returned and stored in an array.
                         That array is then used as the 'output' training data for the
                         neural network that will be used to replicate the controller.
        """
        #These are 'inputs' because the controller torque is proportional to theta
        #and theta_dot (it's a PD controller if you've taken a controls course).
        # In the future this will live elsewhere; it is kept here to evaluate
        # this controller against the analytical one.
        u = np.zeros(4)
        u[0] = state[0]
        u[1] = state[1]

        A = np.zeros((4, 4))
        B = np.zeros((4, 4))
        #Derivative of angular position is angular velocity
        A[0,1] = 1
        #Derivative of position is velocity
        A[2,3] = 1
        # V_x_dot terms
        A[3,0] = self.g*self.LIFT_COEFFICIENT
        A[3,3] = -self.B_w
        # Theta_dot term(s)
        A[1,3] = -self.Rw*self.B_w / self.Jz

        #Apply input torque (as of now this is just the torque generated by the
        #controller that keeps the robot upright)
        torque_constant_prop = 4e-7
        torque_constant_deriv = 0.7e-7
        B[1,0] = -torque_constant_prop / self.Jz
        B[1,1] = -torque_constant_deriv / self.Jz

        state_dot = A.dot(state) + B.dot(u)
        #applied torque is returned and stored as data to train a neural network
        torque_applied = B.dot(u)[1]
        new_state = state.copy() + state_dot.copy() * dt
        return new_state, torque_applied
    def updateState_LQR_Control(self, state, dt, state_desired, gains):
        """
        This function will calculate the next state using the current state
        and the plant's physics (the plant is the transfer function from inputs to the
        Robobee system to its outputs). The inputs generated are dependent on the robot's
        current state, desired final state, and the Q and R matrix values used when
        generating the gains for the Linear Quadratic Regulator (see the LQR_gains()
        method for more information on this).
        This is the state-space form (where x_dot is the derivative of x, which is the
        state vector, u is the input vector, and A and B are coefficient matrices):
            x_dot = A*x + B*u
        ==== ARGUMENTS ====
        state = current state (6 double numpy 1D array)
            state[0] = theta (angle of rotation from inertial to global coords in 2D)
            state[1] = theta_dot (change in theta per unit time)
            state[2] = x axis position in global coordinates
            state[3] = x_dot (x axis velocity)
            state[4] = z axis position in global coordinates
            state[5] = z_dot (z axis velocity)
        dt = time step = 1/120 [seconds] (wings flap at 120 Hz)
        state_desired = final state the robot is trying to get to (6 double numpy 1D array)
            * each state_desired value corresponds with a state value, and all are 0
              except for x and z position, which will be a set of coordinates
              +-> this means we want the robot to stop and hover at the specified x/z point
        gains = coefficients, calculated by the Linear Quadratic Regulator, used to
                generate inputs from the current state values; for more information
                see the LQR_gains() function (1x4 numpy array)
        ==== RETURNS ====
        new_state = The state of the robot one time-step (after dt [seconds]) in the future
        state_dot_lat[1] = torque the robot decides to generate with its wings based
                           on its current state; this is returned and stored in an array.
                           That array is then used as the 'output' training data for the
                           neural network that will be used to replicate the controller.
        """
        A = np.zeros((4, 4))
        B = np.zeros(4).reshape(4,1)
        #Derivative of angular position is angular velocity
        A[0,1] = 1
        #Derivative of position is velocity
        A[2,3] = 1
        # V_x_dot terms
        A[3,0] = self.g*self.LIFT_COEFFICIENT
        A[3,3] = -self.B_w / self.MASS
        # Theta_dot term(s)
        A[1,3] = -self.Rw*self.B_w / self.Jz
        #Coefficients for input matrix B
        B[1] = 1 / self.Jz
        # Closed-loop lateral dynamics: x_dot = (A - B*K)*x + B*K*x_desired
        state_dot_lat = (A - B.dot(gains)).dot(state[:4]) + B.dot(gains).dot(state_desired[:4])

        """ ALTITUDE CONTROLLER
        All it does is adjust the lift force based on where the robot is
        relative to its desired altitude
        """
        adjustment = 0.02
        if (state[5] > 0 and state[5] > (state_desired[4] - state[4])):
            state[5] -= adjustment
        elif (state[5] < 0 and state[5] < (state_desired[4] - state[4])):
            state[5] += adjustment
        else:
            self.LIFT_COEFFICIENT = 1 + (state_desired[4] - state[4])
            if (self.LIFT_COEFFICIENT > 1.5):
                self.LIFT_COEFFICIENT = 1.5
            elif (self.LIFT_COEFFICIENT < 0.5):
                self.LIFT_COEFFICIENT = 0.5

        # Calculate change in z axis position and z axis velocity based on lift coefficient and orientation of robot
        state_dot_alt = np.array([state[5], self.MASS*self.g*(self.LIFT_COEFFICIENT*np.cos(state[0]) - 1)]).reshape(2,1)
        # Combine change in state for lateral variables (theta, theta_dot, x, x_dot)
        # and altitude variables (z, z_dot)
        state_dot = np.vstack([state_dot_lat, state_dot_alt])
        # New state is the last state plus the current rate of state change
        # multiplied by the time step between states
        new_state = state + state_dot*dt
        return new_state, state_dot_lat[1]
    def LQR_gains(self):
        """
        This function uses the Robot's physics and two matrices, Q and R, that the
        user can change to adjust controller performance. The Robot's A and B matrices
        (i.e. the plant physics of the system) as well as the Q and R matrices are passed
        to a function from the control library that uses these matrices (in this code, R
        is just a constant, but it can also be an identity matrix) to construct and solve
        an algebraic Riccati equation. That call returns a set of gains to obtain the
        performance specified by the Q and R matrices, the solution to the Riccati
        equation, and an array of eigenvalues for the closed-loop system.
        ==== RETURNS ====
        gains = LQR gain matrix K (1x4 numpy array), used by updateState_LQR_Control()
                to generate the input torque from the state error
        """
        A = np.zeros((4, 4))
        B = np.zeros(4).reshape(4,1)
        #Derivative of angular position is angular velocity
        A[0,1] = 1
        #Derivative of position is velocity
        A[2,3] = 1
        # V_x_dot terms
        A[3,0] = self.g*self.LIFT_COEFFICIENT
        A[3,3] = -self.B_w / self.MASS
        # Theta_dot term(s)
        A[1,3] = -self.Rw*self.B_w / self.Jz
        #Note: There are no terms in the A matrix for V_z_dot because that is
        #      controlled by the altitude controller, which is decoupled from this
        #      controller (the lateral controller)
        #Coefficients for input matrix B
        B[1] = 1 / self.Jz
        Q = np.zeros((4,4))
        """
        I was going to write out an explanation on how to choose proper Q and R matrix
        weights, but I think this paper does a great job:
        https://www.cds.caltech.edu/~murray/courses/cds110/wi06/lqr.pdf
        Section 1 explains the basics of how a LQR works, and section 2 gives some
        guidelines on selecting weights for both.
        """
        Q[0,0] = 100
        Q[1,1] = 1
        Q[2,2] = 100
        Q[3,3] = 0.1
        R = 5e15
        # If you're getting weird errors from this function call, try installing python
        # and all the libraries needed for this code in a conda environment using
        # anaconda prompt. That worked very well for me (the issue has to do with
        # numpy not always installing with something called mkl, that's all I know)
        gains, riccati, eigs = control.lqr(A, B, Q, R)
        return gains
    def run_lqr(self, timesteps, verbose = False, plots = True):
        """
        This function drives the LQR solver by calling the updateState_LQR_Control
        function a certain number of times (or until the desired state is reached).
        It logs state, system inputs, and estimated state data at each time step.
        It generates a few plots using this data after it finishes simulating the
        Robobee, and then returns the state data and system input data in two
        separate arrays. This can be used as training data for a neural network
        that is being trained to replicate the LQR controller. The state data would
        serve as the input to such a network, and the torques generated would be
        the output.
        ==== ARGUMENTS ====
        timesteps = number of times the updateState_LQR_Control will be called before
                    stopping the simulation. Note that if the desired final state is
                    reached, the simulation will stop even if it hasn't run through
                    all of the time steps.
        verbose = if set to true, this function will periodically print state data
                  while the simulation is running
        plots = if set to true, this function will generate a few plots to outline
                system performance during the simulation
        ==== RETURNS ====
        state_data = state of the robot at each time step (training input for a NN)
        torque_data = torque the LQR function told the robot to generate at each
                      time step (training output for a NN)
        """
        print("Running Simulation with LQR controller...")
        state = np.zeros(6).reshape(6,1)
        state_desired = np.array([0.0, 0.0, 2, 0.0, 2, 0.0]).reshape(6,1)
        gains = self.LQR_gains()
        torque_gen = 0
        pointReached = False
        i = 0
        while i < timesteps and not pointReached:
            # if the sum of the absolute differences between each state variable and
            # its desired value is less than 0.01, the simulation will stop, as the
            # robot has (more or less) reached its desired state
            diff = np.sum(np.abs(state - state_desired))
            if diff < 0.01:
                pointReached = True
            # Logging data at each time step
            if (i==0):
                state_data = np.vstack([ state, state_desired[2], state_desired[4] ])
                sensor_data = np.array([0.0, 0.0]).reshape(2,1)
                aVelEstimates = np.array([0.0, 0.0]).reshape(2,1)
            else:
                state_data = np.hstack([ state_data, np.vstack([state, state_desired[2], state_desired[4] ]) ])
                new_reading = self.readSensors(state[0])
                aVelEstimates = self.getAngularVel(new_reading, torque_gen)
                sensor_data = np.hstack([ sensor_data, aVelEstimates ])
            # If verbose is set to true, this conditional periodically prints state
            # data to the console
            if i%10 == 0 and verbose:
                print("State at time step", i, ":\t", state)
                if (i != 0):
                    print("A-Vel Estimates: ", aVelEstimates)
            estimated_state = state.copy()
            estimated_state[1] = aVelEstimates[0]
            state, torque_gen = self.updateState_LQR_Control(estimated_state, self.dt, state_desired, gains)
            if (i==0):
                torque_data = np.array(torque_gen)
            else:
                torque_data = np.append(torque_data, torque_gen)
            i += 1
        state_data = np.array(state_data)
        # Create time array to plot state data against
        t = np.linspace(0, self.dt*state_data.shape[1], state_data.shape[1])
        if plots:
            plt.figure(figsize=[8,6])
            plt.title("Comparing Actual and Estimated Angular Velocity")
            plt.plot(t, sensor_data[0,:], label="Estimates from Sensors")
            plt.plot(t, state_data[1,:], label="Actual Angular Velocity Values")
            plt.plot(t, state_data[0,:], label="Actual Theta Value")
            plt.ylabel("Rotational Velocity [rad/sec]")
            plt.xlabel("Time [sec]")
            plt.xlim(0,1)
            #plt.ylim(-2,2)
            plt.legend()
            plt.show()

            plt.figure(figsize=[9,7])
            plt.suptitle("LQR Controller - Position (Desired Position x=%4.2f, y=%4.2f)" % (state_desired[2], state_desired[4]))
            #plt.suptitle("LQR Controller (R = 10)")
            plt.subplot(1,2,1)
            plt.plot(state_data[2,:], state_data[4,:])
            plt.ylabel('Y [m]')
            plt.xlabel('X [m]')
            #plt.plot(t, state_data[2,:])
            #plt.ylabel('X [m]')
            #plt.xlabel("t [sec]")
            plt.grid()
            plt.subplot(1,2,2)
            plt.plot(t, state_data[0,:], label='Theta [rad]')
            plt.plot(t, state_data[1,:], label='Omega (Theta Dot) [rad/sec]')
            plt.xlim(0,1) #angle usually converges within the first 100 time steps of the simulation
            #plt.ylim(-0.5,0.5)
            plt.xlabel("Time [sec]")
            plt.ylabel("Magnitude")
            plt.legend()
            plt.show()
        print("Done!")
        if i == timesteps:
            print("Destination not reached in", i, "time steps.")
        else:
            print("Destination reached in", i, "time steps.")
        return np.transpose(state_data), torque_data
    def run_pd(self, timesteps, verbose = False, plots = True):
        """
        This function drives the PD controller by calling the updateState_PD_Control
        function a number of times equal to the timesteps argument. It logs state
        and system input data at each time step. It generates a plot using this
        data after it finishes simulating the Robobee, and then returns the state
        data and system input data in two separate arrays. This can be used as
        training data for a neural network that is being trained to replicate the
        PD controller. The state data would serve as the input to such a network,
        and the torques generated would be the output.
        ==== ARGUMENTS ====
        timesteps = number of times the updateState_PD_Control will be called before
                    stopping the simulation
        verbose = if set to true, this function will periodically print state data
                  while the simulation is running
        plots = if set to true, this function will generate a plot to outline
                system performance during the simulation
        ==== RETURNS ====
        state_data = state of the robot at each time step (training input for a NN)
        torques_data = torque the PD controller told the robot to generate at each
                       time step (training output for a NN)
        """
        print("Running simulation with PD controller...")
        seed(0) #initializes random number generator
        state = np.zeros(4)
        state[1] = -10 + (random() * 20)
        for i in range(timesteps):
            if state[0] > 0.176:
                print("WARNING: Robot has rotated so much that the error of the small angle approximation has exceeded 1%.")
                print("In a real experiment, this would likely result in the robot losing control and crashing.")
                print("Current state: ", state)
            if verbose:
                print("State at time step ", i, ":\t", state)
            if(i % 250 == 0):
                #this conditional occasionally varies the angular velocity to
                #validate the functionality of the torque controller
                state[1] = -10 + (random() * 20)
            if (i==0):
                state_data = np.array(state)
            else:
                state_data = np.vstack([state_data, np.array(state)])
            state, torque_applied = self.updateState_PD_Control(state.copy(), self.dt)
            if (i==0):
                torques_data = np.array(torque_applied)
            else:
                torques_data = np.append(torques_data, torque_applied)
        t = np.linspace(0, self.dt*len(state_data[:,0]), len(state_data[:,0]))
        if plots:
            plt.plot(t, state_data[:,3], label='Velocity [m/s]')
            plt.plot(t, state_data[:,1], label='Angular Velocity [rad/s]')
            plt.plot(t, state_data[:,0], label='Angular Position [rad]')
            plt.grid()
            plt.legend()
            #plt.ylim(-10, 10)
            plt.ylabel("Magnitude")
            plt.xlabel("time [sec]")
            plt.title("State Space - PD Controller")
            plt.show()
        print("Done!")
        return state_data, torques_data
    def readSensors(self, theta):
        """
        This function provides a crude estimation for what each of the robot's four
        phototransistors would output given the robot's current state. The thought behind
        this estimation was to use the light output of a typical lightbulb (seems to
        be around 850 lumens) and then to have the brightness adjust proportionally
        to the angle between the vector from the robot to the light and the vector
        normal to the sensor's surface.
        The reading is in lux (lumens per square meter; the light is assumed to be
        1 meter away, meaning the light's output is spread over a sphere with
        surface area of 4*pi*[1]^2 = 4*pi [m^2]).
        A lot of the ideas for these 'estimations' hinge on the assumptions in sections
        4 and 4.4 of this paper titled 'Controlling free flight of a robotic fly using an
        onboard vision sensor inspired by insect ocelli' by S.B. Fuller et al. :
        https://royalsocietypublishing.org/doi/full/10.1098/rsif.2014.0281#d3e1883
        ==== ARGUMENTS ====
        theta = current angle of rotation of robot body relative to pointing straight up
        ==== RETURNS ====
        sensor_readings = array of 4 doubles containing the predicted output of each of the
                          four phototransistors given the robot's current state
        """
        # Normalized vector from robot to light source. Check out the assumptions in the
        # paper mentioned above for an in-depth explanation as to how it's okay to use
        # something this simple for this vector
        light_vec = [0,1,0]
        light_output = 850 #output of light in lumens, typical bulbs give off 600~1200ish lumens
        init_angle = 30 * np.pi / 180 #angle of normal vector of sensor face relative to horizontal
                                      # when robot is in starting position (pointing straight up)
        new_sensor_orientations = np.array([ [np.cos(init_angle - theta), np.sin(init_angle - theta), 0],
                                             [np.sin(init_angle)*np.sin(theta), np.sin(init_angle)*np.cos(theta), np.cos(init_angle)],
                                             [-np.cos(init_angle + theta), np.sin(init_angle + theta), 0],
                                             [np.sin(init_angle)*np.sin(theta), np.sin(init_angle)*np.cos(theta), -np.cos(init_angle)] ])
        sensor_readings = np.array([0.0, 0.0, 0.0, 0.0]).reshape(4,1)
        for i in range(new_sensor_orientations.shape[0]):
            # calculate the angle between the vector normal to this phototransistor's face
            # and the vector from the robot to the light source using the definition of
            # the dot product (cos(angle) = a.b / (|a|*|b|))
            angle = np.arccos(np.dot(light_vec, new_sensor_orientations[i]) /
                              (np.linalg.norm(light_vec) * np.linalg.norm(new_sensor_orientations[i])))
            # assume sensor output is proportional to the angle calculated in the step above
            sensor_readings[i] = light_output * angle
        return sensor_readings
    def getAngularVel(self, new_readings, torque_gen):
        """
        This function reads in the current sensor readings, and uses those alongside
        the sensor readings of the last time step in order to estimate the robot's
        current angular velocity. The last torque applied is also considered, as without
        taking that into consideration the estimation algorithm is not adequate.
        For a derivation of the math used in this function, check out section 4 of 'Controlling free flight
        of a robotic fly using an onboard vision sensor inspired by insect ocelli' by
        S.B. Fuller et al. :
        https://royalsocietypublishing.org/doi/full/10.1098/rsif.2014.0281#d3e1883
        ==== ARGUMENTS ====
        new_readings = Sensor readings generated by the readSensors() function during
                       the current time step
        torque_gen = torque generated in the updateState_LQR_Control() function during
                     the current time step
        ==== RETURNS ====
        angular_vel_estimates = estimated angular velocities about Y and X axes (in
                                that order; X axis angular velocity is just about always 0)
        """
        diffs = new_readings - self.last_sensor_readings
        self.last_sensor_readings = new_readings
        # Factor to convert from angular position to sensor readings.
        # Original estimate was pi/850; the 7468.8 scales from the obtained
        # values to the desired values
        k = np.pi / (850) * 7468.8
        # Matrix to convert change in sensor readings (d[SR]) over time to angular velocity (w)
        # such that: w = L*d[SR]
        L = np.array([ [np.sqrt(3)/k, 0, -np.sqrt(3)/k, 0],
                       [0, -np.sqrt(3)/k, 0, np.sqrt(3)/k] ])
        angular_vel_estimates = L.dot(diffs)
        angular_vel_estimates[0,0] += self.dt*torque_gen
        print(angular_vel_estimates)
        return angular_vel_estimates
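The gain computation inside LQR_gains() delegates the algebraic Riccati solve to `control.lqr`. As a sanity check, the same gain matrix can be reproduced standalone with SciPy. This is a minimal sketch, not part of the original class: the constants and Q/R weights are copied from the code above, and `scipy.linalg.solve_continuous_are` is assumed as a stand-in for the `control` package.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Constants copied from the roboBee class above
B_w, Rw, Jz, MASS, g = 0.0002, 0.009, 0.45e-9, 0.08, 9.81

# Lateral plant, identical to the A and B built in LQR_gains()
A = np.zeros((4, 4))
A[0, 1] = 1                # d(theta)/dt = theta_dot
A[2, 3] = 1                # d(x)/dt = x_dot
A[3, 0] = g                # tilt couples gravity into lateral accel (LIFT_COEFFICIENT = 1)
A[3, 3] = -B_w / MASS      # drag on lateral velocity
A[1, 3] = -Rw * B_w / Jz   # drag torque from lateral motion
B = np.zeros((4, 1))
B[1, 0] = 1 / Jz           # input torque acts on theta_dot

# Same weights as in LQR_gains()
Q = np.diag([100.0, 1.0, 100.0, 0.1])
R = np.array([[5e15]])

# Solve A'P + P A - P B R^-1 B' P + Q = 0, then K = R^-1 B' P,
# which is what control.lqr(A, B, Q, R) returns as its first value
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)
print("LQR gains K =", K)  # 1x4 gain row vector
```

If the `control` package is installed, `control.lqr(A, B, Q, R)[0]` should match `K` to numerical precision, and the closed-loop matrix `A - B @ K` should have eigenvalues with negative real parts.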