
Commit 484a2fb

2 parents 3ddd9dc + 6ed24f3 commit 484a2fb

2 files changed

Lines changed: 132 additions & 7 deletions

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"id": "962e8194",
6+
"metadata": {},
7+
"source": [
8+
"## File types and git: text versus binary\n",
9+
"\n",
10+
"Version control systems oriented towards software development and programming are typically focused on **text-based files**: files where the contents are viewable on your computer as human-readable text. **Binary files,** on the other hand, are organized and saved with bits (`0`'s and `1`'s) and are not human-readable. Although this may be a simplified description in terms of the way computers store information (you can read more [here](https://en.wikipedia.org/wiki/Binary_file)), it is enough for our purposes to recognize that text-based files are best suited for use with version control system; in other words, your Python code!\n",
11+
"\n",
12+
"* Examples of common text-based file extensions are: `txt`, `md`, `csv`, `ipynb`, `py`, `html`, etc.\n",
13+
"* Examples of common binary files are: `pdf`, `ppt`, `xlsx`, `docx`, etc.\n",
14+
"\n",
15+
"```{admonition} Try it!\n",
16+
"Try exploring a few files on your computer to confirm wether they are text-based or binary by opening them up in a text editor. You will easily be able to distinguish the difference because one is readable, the other not.\n",
17+
"\n",
18+
"Note that in Windows if you are using Notepad (the default), you will want to select \"Word Wrap\" under the \"Format\" menu to fit the contents of very long lines within the visible width of the window.\n",
19+
"```\n",
20+
"\n",
21+
"## Jupyter Notebooks: JSON-format\n",
22+
"\n",
23+
"Jupyter notebooks, `ipynb`, are a special case in the discussion text vs binary. Because while the contents of your Markdown and code cells is saved as text in the file, the output of the code cells is sometimes a binary format. For example, if you create a plot using matplotlib and save the notebook, that plot output will be binary. This unfortunately makes it a little more difficult to use notebooks with version control, but if we are aware of the issue, it is not a problem---we will show you how.\n",
24+
"\n",
25+
"If you tried opening up a notebook in the text editor you would have noticed a structure with curly braces, `{}`. This is [JSON-format](https://en.wikipedia.org/wiki/JSON)] (another file type), which the notebook uses to store information in each cell.\n",
26+
"\n",
27+
"For example, the following two cells in a jupyter notebook:"
28+
]
29+
},
30+
{
31+
"cell_type": "code",
32+
"execution_count": null,
33+
"id": "6f894e15",
34+
"metadata": {},
35+
"outputs": [],
36+
"source": [
37+
"import numpy as np"
38+
]
39+
},
40+
{
41+
"cell_type": "code",
42+
"execution_count": null,
43+
"id": "c8cea055",
44+
"metadata": {},
45+
"outputs": [],
46+
"source": [
47+
"np.linspace(0, 1, 10)"
48+
]
49+
},
50+
{
51+
"cell_type": "markdown",
52+
"id": "8c741b3c",
53+
"metadata": {},
54+
"source": [
55+
"Looks as follows, and that's even without output:\n",
56+
"\n",
57+
"```yaml\n",
58+
"{\n",
59+
" \"cells\": [\n",
60+
" {\n",
61+
" \"cell_type\": \"code\",\n",
62+
" \"execution_count\": 1,\n",
63+
" \"id\": \"e7dcf271\",\n",
64+
" \"metadata\": {},\n",
65+
" \"outputs\": [],\n",
66+
" \"source\": [\n",
67+
" \"import numpy as np\"\n",
68+
" ]\n",
69+
" },\n",
70+
" {\n",
71+
" \"cell_type\": \"code\",\n",
72+
" \"execution_count\": 2,\n",
73+
" \"id\": \"24165ce8\",\n",
74+
" \"metadata\": {},\n",
75+
" \"outputs\": [\n",
76+
" {\n",
77+
" \"data\": {\n",
78+
" \"text/plain\": [\n",
79+
" \"array([0. , 0.11111111, 0.22222222, 0.33333333, 0.44444444,\\n\",\n",
80+
" \" 0.55555556, 0.66666667, 0.77777778, 0.88888889, 1. ])\"\n",
81+
" ]\n",
82+
" },\n",
83+
" \"execution_count\": 2,\n",
84+
" \"metadata\": {},\n",
85+
" \"output_type\": \"execute_result\"\n",
86+
" }\n",
87+
" ],\n",
88+
" \"source\": [\n",
89+
" \"np.linspace(0, 1, 10)\"\n",
90+
" ]\n",
91+
" }\n",
92+
" ],\n",
93+
" \"metadata\": {\n",
94+
" \"kernelspec\": {\n",
95+
" \"display_name\": \"mude-base\",\n",
96+
" \"language\": \"python\",\n",
97+
" \"name\": \"python3\"\n",
98+
" },\n",
99+
" \"language_info\": {\n",
100+
" \"codemirror_mode\": {\n",
101+
" \"name\": \"ipython\",\n",
102+
" \"version\": 3\n",
103+
" },\n",
104+
" \"file_extension\": \".py\",\n",
105+
" \"mimetype\": \"text/x-python\",\n",
106+
" \"name\": \"python\",\n",
107+
" \"nbconvert_exporter\": \"python\",\n",
108+
" \"pygments_lexer\": \"ipython3\",\n",
109+
" \"version\": \"3.12.11\"\n",
110+
" }\n",
111+
" },\n",
112+
" \"nbformat\": 4,\n",
113+
" \"nbformat_minor\": 5\n",
114+
"}\n",
115+
"```"
116+
]
117+
},
118+
{
119+
"cell_type": "markdown",
120+
"id": "cca01221",
121+
"metadata": {},
122+
"source": []
123+
}
124+
],
125+
"metadata": {
126+
"language_info": {
127+
"name": "python"
128+
}
129+
},
130+
"nbformat": 4,
131+
"nbformat_minor": 5
132+
}
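
The notebook structure shown in the diff above can be read with Python's standard `json` module. The following is a minimal sketch, assuming a small hypothetical notebook embedded as a string (it is not the file from this commit): it counts the outputs stored in each cell and then clears them, which is one way to keep output data out of version control. The helper names `summarize_cells` and `strip_outputs` are illustrative only.

```python
import json

# Hypothetical, minimal notebook content used for illustration only.
notebook_text = """
{
 "cells": [
  {"cell_type": "code", "execution_count": 1, "metadata": {},
   "outputs": [], "source": ["import numpy as np"]},
  {"cell_type": "code", "execution_count": 2, "metadata": {},
   "outputs": [{"output_type": "execute_result", "execution_count": 2,
                "metadata": {}, "data": {"text/plain": ["array([0., 1.])"]}}],
   "source": ["np.linspace(0, 1, 2)"]}
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}
"""


def summarize_cells(nb):
    """Return a (cell_type, number_of_outputs) pair for every cell."""
    return [(cell["cell_type"], len(cell.get("outputs", []))) for cell in nb["cells"]]


def strip_outputs(nb):
    """Clear outputs and execution counts of all code cells, in place."""
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return nb


notebook = json.loads(notebook_text)
summary_before = summarize_cells(notebook)
summary_after = summarize_cells(strip_outputs(notebook))
print(summary_before)
print(summary_after)
```

Only the `source` entries remain meaningful after stripping, so a later comparison of two versions of the file shows changes to the code and Markdown rather than to the stored results.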

book/version_control/version_control.md

Lines changed: 0 additions & 7 deletions
@@ -54,13 +54,6 @@ Try exploring a few files on your computer to confirm wether they are text-based
5454
Note that in Windows if you are using Notepad (the default), you will want to select "Word Wrap" under the "Format" menu to fit the contents of very long lines within the visible width of the window.
5555
```
5656

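The check described in the admonition above can also be approximated in code. This is a rough sketch of one common heuristic, not a definitive test: a file is treated as binary if a NUL byte appears in its first few thousand bytes. The function name `looks_binary`, the sample size, and the two temporary example files are assumptions made for illustration.

```python
import os
import tempfile


def looks_binary(path, sample_size=8000):
    """Heuristic: report a file as binary if its first bytes contain a NUL byte."""
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    return b"\x00" in sample


# Create two small throwaway files so the example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    text_path = os.path.join(folder, "example.py")
    binary_path = os.path.join(folder, "example.bin")

    with open(text_path, "w", encoding="utf-8") as handle:
        handle.write("print('hello')\n")
    with open(binary_path, "wb") as handle:
        handle.write(bytes([0, 255, 16, 0, 128]))

    text_result = looks_binary(text_path)
    binary_result = looks_binary(binary_path)

print(text_result, binary_result)
```

The heuristic can misclassify unusual files (for example, text saved in an encoding that contains NUL bytes), so opening the file in a text editor remains the simplest check.
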
57-
### Jupyter Notebooks
58-
59-
Jupyter notebooks, `ipynb`, are a special case because while the contents of your Markdown and code cells is saved as text in the file, the output of the code cells is sometimes a binary format. For example, if you create a plot using matplotlib and save the notebook, that plot output will be binary. This unfortunately makes it a little more difficult to use notebooks with VCS, but if we are aware of the issue, it is not a problem---we will show you how.
60-
61-
If you tried opening up a notebook in the text editor you would have noticed a structure with curly braces, `{}`. This is [JSON-format](https://en.wikipedia.org/wiki/JSON)] (another file type), which the notebook uses to store information in each cell.
62-
63-
6457
## A different way of thinking?
6558

6659
As you will see in the other chapters on git, when applied to code, version control takes on a very different appearance than what you are used to with traditional backup software, for example, Microsoft Word auto-save, or cloud-based services like OneDrive, Dropbox or even [Visual Studio Code Share](../install/ide/vsc.md). All of these platforms are set up in a user-friendly way that is _focused on a single file._ This works fine when we are writing a report like a thesis. However, it does **not** work well when it comes to computer programs, because in addition to the files themselves, the _contents of the file_ become critical. As we will see, git is a version control software that allows us to compare and track changes in every character of text within a file, which is very useful when writing code, as well as working with a distributed team of collaborators.
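
To give a feel for what comparing the contents of a file means, here is a small sketch using Python's standard `difflib` module. It is only an illustration of a line-by-line comparison, not git's own implementation; the two versions of the code and the file names `old.py` and `new.py` are hypothetical.

```python
import difflib

# Two hypothetical versions of the same small script.
old_version = [
    "import numpy as np\n",
    "x = np.linspace(0, 1, 10)\n",
]
new_version = [
    "import numpy as np\n",
    "x = np.linspace(0, 1, 20)\n",
]

# unified_diff yields header lines, then removed lines prefixed with "-"
# and added lines prefixed with "+".
diff_lines = list(
    difflib.unified_diff(old_version, new_version, fromfile="old.py", tofile="new.py")
)
print("".join(diff_lines))
```

Only the changed line is marked as removed and added again; the unchanged `import` line appears as context. This kind of comparison is only meaningful for text-based files.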
