-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy path12-print-n-lines-clarg.Rmd
More file actions
172 lines (115 loc) · 6.61 KB
/
12-print-n-lines-clarg.Rmd
File metadata and controls
172 lines (115 loc) · 6.61 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
```{r, include = FALSE}
ottrpal::set_knitr_image_path()
```
# Applied Python Exercise 4
## Goal -- Print a specific number of lines from the beginnning of an input file, allowing the specification of the number of lines outside of the code by the user
Write Python code that works towards recreating the Bash tool `head`, displaying a user specified number of lines from the beginning of a user specified input file.
## Learning Objectives
After going through this chapter, students should be able to:
* State the sub-steps needed to meet the coding goal
* Use the following datatypes, structures, and fundamentals to meet the coding goal:
* importing modules
* variable assignment
* list indexing
* integer conversion
* opening a file
* for loop
* logical expression
* conditional statement
* printing output
## Coding Blueprint
Let's again start with and edit the pseudocode from the previous chapter to meet the needs of this chapter.
Previous chapter's pseudocode:
First, we need to SET the input file <br />
Next, we need to SET the desired number of displayed lines <br />
Then, FOR every line in the open file <br />
IF a desired line (by its numerical position) <br />
PRINT the line <br />
END IF <br />
END FOR <br />
Do we need to change this pseudocode at all or does it accurately reflect what we want to accomplish with our new script?
***
<details><summary> ANSWER: </summary>
No we don't need to change the pseudocode from the previous Applied Python Exercise chapter because it still accurately reflects the goal we want to accomplish within this chapter.
</details>
***
## Building the code
**Create a new Python script**
We may not need to edit the pseudocode for this chapter, but we will need to edit the code. The end users of tools don't have access to the values of variables within code unless the variables can be passed information through the command line. So rather than defining the variables for the input filename and desired number of lines within the script as hardcoded, inflexible variables with specific values, we're going to define the variables based on input arguments used when running the script, increasing the flexibility of the tool.
The `sys` module is useful for this in Python. Specifically, it allows you to use a structure called `sys.argv` which is a list that stores the command line arguments passed to a Python script. `sys.argv[0]` is always the script name with the additional information/arguments starting at index 1 or `sys.argv[1]`.
When you use `sys.argv` in a script, it's a good practice to include a USAGE statement comment at the beginning of the script showing how the script should be called. In our case, we want the user to pass the input filename as the first argument and the desired number of lines as the second: `#USAGE: python scriptname.py input_filename number_lines_to_display` (Note that calling the python interpreter is not included in the list since it's before the script name).
So in building our code, we want to make sure to import the `sys` module and set a USAGE comment statement. Within your new Python script, do these two things.
***
<details><summary> ANSWER: </summary>
```{python, eval = FALSE}
#USAGE: python scriptname.py input_filename number_lines_to_display
import sys
```
</details>
***
### SET the input filename
In the code for the past chapters, we've set the filename as a hardcoded string variable. Now you want to set it as the first command line argument passed to the program when calling the script. Do this within your Python script.
***
<details><summary> ANSWER: </summary>
```{python, eval = FALSE}
filename = sys.argv[1]
```
</details>
***
### SET the desired number of lines
In the code for the previous Applied Python Exercise chapter, we set the desired number of lines as a hardcoded integer variable. Now you want to set it as second command line argument passed to the program when calling the script. Do this within your Python script.
***
<details><summary> ANSWER: </summary>
```{python, eval = FALSE}
n_lines = sys.argv[2]
```
However, we need the variable to be an integer, and the values in the `sys.argv` list are always stored as strings, even if they are numbers. Therefore, edit the line within your Python script to convert the value that is being stored to an integer (if you hadn't already converted the string to an integer).
```{python, eval = FALSE}
n_lines = int(sys.argv[2])
```
</details>
***
Now that we've managed to set the input file and the desired number of lines to values that are specified by the user on the command line, we can reuse the rest of the code for the next three steps that we used in the previous Applied Python Exercise chapter.
### FOR every line in the open file
To loop through every line in the open file, use the `for` statement structure from the second and third Applied Python Exercise chapters. Write this line of code within your Python script.
***
<details><summary> ANSWER: </summary>
```{python, eval = FALSE}
for i, line in enumerate(open(filename)):
```
</details>
***
### IF a desired line (by its numerical position)
To ask if the line is one of the desired beginning lines, use the conditional statement structure from the third Applied Python Exercise chapter. Write this line of code within your Python script, indenting correctly under the `for` loop statement.
***
<details><summary> ANSWER: </summary>
```{python, eval = FALSE}
if i < n_lines:
```
</details>
***
### PRINT the line
Finally, reuse the code from the first three Applied Python Exercise chapters to print the line, adding this `print()` statement indented correctly under the conditional within your Python script.
***
<details><summary> ANSWER: </summary>
```{python, eval =FALSE}
print(line.strip('\r\n'))
```
</details>
***
Within your Python script, you should have the seven lines of code together which makes our complete intended goal code -- code that prints a user specified number of lines from the beginning of a user specified input file.
**Use the command line or the Run button in the online interface to run the Python script and look at the output.**
## Complete Intended Goal Code
***
<details><summary> ANSWER: </summary>
```{python, eval = FALSE}
#USAGE: python scriptname.py input_filename number_lines_to_display
import sys #import module
filename = sys.argv[1] # SET the input filename
n_lines = int(sys.argv[2]) # SET the desired number of lines
for i, line in enumerate(open(filename)): #FOR every line in the open file
if i < n_lines: #IF a desired line by its numerical position
print(line.strip('\r\n')) #PRINT the line
```
</details>
***