Question 1
Our data set covers cargo theft in the USA as reported by the FBI. The FBI's definition of cargo includes physical items as well as cyber and documentation fraud. The data runs from 2013 to 2019 and contains roughly 87,000 rows. It was collected by the Uniform Crime Reporting (UCR) program, an American program that gathers crime data from more than 18,000 agencies across the country that submit their data voluntarily. There are many interesting variables, including the location of the theft, the value of goods stolen, the value of goods recovered, whether or not the goods were recovered, and the population size of the area where the goods were stolen. Other variables are also included, such as offender age, race and gender, and whether a weapon was used. However, these columns have a lot of missing values, so we will likely not include them.
Question 2
There are many interesting questions we could explore with this data set, including:
-What areas are the most prone to theft?
-What kinds of goods are most commonly stolen?
-What is the average difference between the value of goods stolen and the value of goods recovered, and what percentage of the stolen value is recovered?
-What years were particularly bad for crime?
-Is there a relationship between population size and crime rate?
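
The recovered-value question above comes down to simple arithmetic, which could be sketched roughly as follows. This is only a minimal illustration, not our final analysis: the records and the field names (state, stolen_value, recovered_value) are made up for the example and are not the actual column names in the FBI file.

```python
from statistics import mean

# Hypothetical records for illustration only; the field names are assumptions,
# not the real column names in the FBI cargo theft data.
incidents = [
    {"state": "TX", "stolen_value": 1000.0, "recovered_value": 250.0},
    {"state": "TX", "stolen_value": 3000.0, "recovered_value": 0.0},
    {"state": "CA", "stolen_value": 2000.0, "recovered_value": 2000.0},
]


def average_unrecovered(rows):
    """Mean per-incident difference between value stolen and value recovered."""
    return mean(r["stolen_value"] - r["recovered_value"] for r in rows)


def percent_recovered(rows):
    """Total recovered value as a percentage of total stolen value.

    Returns None when nothing was stolen, to avoid dividing by zero.
    """
    total_stolen = sum(r["stolen_value"] for r in rows)
    if total_stolen == 0:
        return None
    total_recovered = sum(r["recovered_value"] for r in rows)
    return 100.0 * total_recovered / total_stolen


print(average_unrecovered(incidents))
print(percent_recovered(incidents))
```

Note that the percentage here is computed from totals rather than by averaging per-incident percentages, so large thefts weigh more heavily. Either choice could be reasonable depending on the question being asked.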
With some of these questions in mind, we can use our analysis to start asking how to prevent these crimes or how best to mitigate the losses. Although this data is from the USA, we could apply our analysis at a more local scale or even extend it globally. This wide reach is part of the reason we are interested in this data. When we build a dashboard at the end, we expect the data to lend itself well to visualization, as we can plot it on a map of the USA to better illustrate some of the questions above.