Refactor codebase for improved structure and readability
- Organize utility functions into dedicated modules.
- Separate benchmarking and prediction logic into distinct directories.
- Streamline model initialization for CUDA, ONNX, and other environments.
- Enhance benchmark visualization and address Seaborn deprecation warnings.
- Improve error handling and logging for better debugging.
### Command-Line Arguments

- `--image_path`: (Optional) Specifies the path to the image you want to predict.
- `--topk`: (Optional) Specifies the number of top predictions to show. Defaults to 5 if not provided.
- `--mode`: (Optional) Specifies the mode for exporting and running the model. Choices are: `onnx`, `ov`, `all`. If not provided, it defaults to `all`.
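For reference, here is a minimal sketch of how these flags could be wired up with `argparse`. The argument names and defaults mirror the documentation above, but the overall structure is an illustrative assumption, not the project's actual `main.py`.

```python
import argparse


def parse_args():
    # Flags mirror the documented CLI: --image_path, --topk, --mode.
    parser = argparse.ArgumentParser(description="Run predictions and benchmarks.")
    parser.add_argument("--image_path", default="./inference/cat3.jpg",
                        help="Path to the image to predict.")
    parser.add_argument("--topk", type=int, default=5,
                        help="Number of top predictions to show.")
    parser.add_argument("--mode", choices=["onnx", "ov", "all"], default="all",
                        help="Which export/runtime path to benchmark.")
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    print(args.image_path, args.topk, args.mode)
```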
### Example Command

```sh
python main.py --topk 3 --mode=ov
```

This command runs predictions on the default image (`./inference/cat3.jpg`), shows the top 3 predictions, and runs the OpenVINO model. With `--mode=all`, all models are run instead (PyTorch CPU, CUDA, ONNX, OV, TRT-FP16, TRT-FP32); note that the results plot is created only for `--mode=all`, in which case it is saved to `./inference/plot.png`.
## RESULTS

### Inference Benchmark Results

Here is an example of the input image to run predictions and benchmarks on:

<img src="./inference/cat3.jpg" width="20%">
### Example prediction results

```
#1: 15% Egyptian cat
#2: 14% tiger cat
#3: 9% tabby
#4: 2% doormat
#5: 2% lynx
```
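As a rough illustration, here is a minimal sketch of how top-k output in this format could be produced with a torchvision classifier. The model choice (`resnet50`) and preprocessing are assumptions for the sketch, not necessarily the project's exact pipeline.

```python
import torch
from PIL import Image
from torchvision import models

# Illustrative model; the project's actual model may differ.
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()
labels = weights.meta["categories"]  # ImageNet class names shipped with the weights

image = Image.open("./inference/cat3.jpg").convert("RGB")
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    probs = torch.softmax(model(batch)[0], dim=0)

# Print the top-5 classes with their probabilities as percentages.
top = torch.topk(probs, k=5)
for rank, (p, idx) in enumerate(zip(top.values.tolist(), top.indices.tolist()), start=1):
    print(f"#{rank}: {100 * p:.0f}% {labels[idx]}")
```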
## Benchmark Implementation Details

Here you can see the flow for each model and benchmark.
OpenVINO is a toolkit from Intel that optimizes deep learning model inference for Intel hardware.

4. Perform inference on the provided image using the OpenVINO model.
5. Benchmark results, including average inference time, are logged for the OpenVINO model (see the timing sketch below).
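As a rough illustration of how the average inference time in step 5 might be measured, here is a backend-agnostic timing loop. The warm-up count, iteration count, and the `infer` callable are assumptions for the sketch.

```python
import time


def benchmark(infer, inputs, warmup=5, iterations=50):
    # Warm-up runs are excluded so one-time setup cost doesn't skew the average.
    for _ in range(warmup):
        infer(inputs)
    start = time.perf_counter()
    for _ in range(iterations):
        infer(inputs)
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1000.0  # average latency in milliseconds


# Example: avg_ms = benchmark(lambda x: compiled_model(x), input_batch)
```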
## Used methodologies

### TensorRT Optimization

TensorRT is a high-performance deep learning inference optimizer and runtime library developed by NVIDIA. It is designed for optimizing and deploying trained neural network models in production environments. This project supports TensorRT optimizations in FP32 (single precision) and FP16 (half precision) modes, offering different trade-offs between inference speed and model accuracy.

#### Features

- **Performance Boost**: TensorRT can significantly accelerate the inference of neural network models, making it suitable for deployment in resource-constrained environments.
- **Precision Modes**: Supports FP32 for maximum accuracy and FP16 for faster performance with a minor trade-off in accuracy.
- **Layer Fusion**: TensorRT fuses layers and tensors in the neural network to reduce memory access overhead and improve execution speed.
- **Dynamic Tensor Memory**: Efficiently handles varying batch sizes without re-optimization.
#### Usage

When running the main script, use the `--mode all` argument to employ TensorRT optimizations in the project.
This will initialize all models, including the PyTorch models, which are compiled to TensorRT engines with `FP16` and `FP32` precision modes. Inference is then run on the specified image using the TensorRT-optimized models.

Example:

```sh
python main.py --mode all
```
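Here is a minimal sketch of the kind of compilation step this mode performs with `torch_tensorrt`. The model and input shape are placeholders, and the exact calls in this project may differ.

```python
import torch
import torch_tensorrt
from torchvision import models

# Placeholder model; the project compiles its own PyTorch model.
model = models.resnet50(weights=None).eval().cuda()

# FP16 engine: pass torch.half in enabled_precisions; use {torch.float} for FP32.
trt_fp16 = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.half},
)

with torch.no_grad():
    out = trt_fp16(torch.randn(1, 3, 224, 224).cuda())
    print(out.shape)
```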
#### Requirements

Ensure you have the TensorRT library and the torch_tensorrt package installed in your environment. Also, for FP16 optimizations, it's recommended to have a GPU that supports half-precision arithmetic (like NVIDIA GPUs with Tensor Cores).
### ONNX Exporter

The ONNX Model Exporter (`ONNXExporter`) utility is incorporated within this project to enable converting the native PyTorch model into the ONNX format.
Using the ONNX format, inference and benchmarking can be performed with the ONNX Runtime, which offers platform-agnostic optimizations and is widely supported across numerous platforms and devices.

#### Features

- **Standardized Format**: ONNX provides an open-source format for AI models. It defines an extensible computation graph model and definitions of built-in operators and standard data types.
- **Interoperability**: Models in ONNX format can be used across various frameworks, tools, runtimes, and compilers.
- **Optimizations**: The ONNX Runtime provides performance optimizations for both cloud and edge devices.
#### Usage

To leverage the `ONNXExporter` and conduct inference using the ONNX Runtime, use the `--mode onnx` argument when executing the main script.
This will initiate the conversion process and then run inference on the specified image using the ONNX model.

Example:

```sh
python main.py --mode onnx
```
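Here is a minimal sketch of what an ONNX export plus ONNX Runtime inference step can look like. The file name, input shape, and model are assumptions for the sketch; `ONNXExporter` itself may wrap these calls differently.

```python
import numpy as np
import onnxruntime as ort
import torch
from torchvision import models

model = models.resnet50(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)

# Export the PyTorch model to ONNX (roughly what an exporter utility wraps).
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["input"], output_names=["output"])

# Run the exported model with ONNX Runtime on CPU.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
logits = session.run(None, {"input": np.random.rand(1, 3, 224, 224).astype(np.float32)})[0]
print(logits.shape)
```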
#### Requirements

Ensure the ONNX library is installed in your environment to use the ONNXExporter. Additionally, if you want to run inference using the ONNX model, install the ONNX Runtime.
### OV Exporter

The OpenVINO Model Exporter utility (`OVExporter`) has been integrated into this project to facilitate the conversion of the ONNX model to the OpenVINO format.
This enables inference and benchmarking using OpenVINO, a framework optimized for Intel hardware, providing substantial speed improvements, especially on CPUs.

#### Features

- **Model Optimization**: Converts the ONNX model to OpenVINO's Intermediate Representation (IR) format. This optimized format allows for faster inference times on Intel hardware.
- **Versatility**: OpenVINO can target various Intel hardware devices such as CPUs, integrated GPUs, FPGAs, and VPUs.
- **Ease of Use**: The `OVExporter` seamlessly transitions from ONNX to OpenVINO, abstracting the conversion details and providing a straightforward interface.
#### Usage

To utilize `OVExporter` and perform inference using OpenVINO, use the `--mode ov` argument when running the main script.
This will trigger the conversion process and subsequently run inference on the provided image using the optimized OpenVINO model.

Example:

```sh
python main.py --mode ov
```
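Here is a minimal sketch of the ONNX-to-OpenVINO step and subsequent inference using the `openvino` Python API. The file names and device string are assumptions, and `OVExporter` itself may wrap this differently.

```python
import numpy as np
import openvino as ov

core = ov.Core()

# Read the ONNX model and compile it for a target device (CPU here).
ov_model = core.read_model("model.onnx")
compiled = core.compile_model(ov_model, "CPU")

# Single synchronous inference request on a random input.
result = compiled([np.random.rand(1, 3, 224, 224).astype(np.float32)])
logits = result[compiled.output(0)]
print(logits.shape)
```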
#### Requirements

Ensure you have installed the OpenVINO Toolkit and the necessary dependencies to use OpenVINO's model optimizer and inference engine.

## Benchmarking and Visualization

The results of the benchmarks for all modes are saved and visualized in a bar chart, showcasing the average inference times across different backends. The visualization aids in comparing the performance gains achieved with the different optimizations.
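Here is a minimal sketch of how such a bar chart might be generated with Seaborn (assuming Seaborn ≥ 0.13, which the deprecation note in the changelog suggests). The column names and the numbers in the frame are placeholders purely for illustration, not the project's recorded results.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder values for illustration; real values come from the benchmark runs.
df = pd.DataFrame({
    "backend": ["PyTorch CPU", "CUDA", "ONNX", "OV", "TRT-FP16", "TRT-FP32"],
    "avg_ms": [50.0, 8.0, 20.0, 15.0, 3.0, 5.0],
})

# Assigning hue (with the legend disabled) avoids the Seaborn warning about
# passing a palette without a hue variable.
ax = sns.barplot(data=df, x="backend", y="avg_ms", hue="backend",
                 palette="viridis", legend=False)
ax.set_ylabel("Average inference time (ms)")
plt.xticks(rotation=30, ha="right")
plt.tight_layout()
plt.savefig("./inference/plot.png")
```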