
Commit 677a2b8

SeregaCodit and Serhii Naumenko authored
chore(annotation_converter): add annotation converter command and convertion from voc to yolo format
* chore(annotation_converter): initial commit for annotation converter command and business-logic
* chore(converter): move files from annotation_converter to converter directory
* chore(converter): add base code for converter and reader
* chore(annotation_converter): add voc to yolo format converter
* feat(annotation_converter)!: add yolo writer, finish conversation algorithm from voc to yolo
* refactor(annotation_converter)!:refactor voc to yolo converter to multiprocessing
* refactor(annotation_converter)!:refactor voc to yolo add files writer to convert pipeline for multiprocess writing files
* docs(annotation_converter): add docstrings for voc to yolo convertion logic
* refactor(dhash): remove commented rows
* docs(docs): add mcdocs and documentation

---------

Co-authored-by: Serhii Naumenko <naumenko.s.mail@gmail.com>
1 parent 53bca7f commit 677a2b8

46 files changed

Lines changed: 612 additions & 53 deletions


.github/workflows/deploy_docs.yml

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+name: Publish Docs
+on:
+  push:
+    branches:
+      - main # Run only on pushes to main
+
+permissions:
+  contents: write
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install Dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+          pip install mkdocs-material mkdocstrings[python]
+
+      - name: Deploy to GitHub Pages
+        run: mkdocs gh-deploy --force

README.MD

Lines changed: 11 additions & 8 deletions
@@ -1,20 +1,23 @@
-# Automatic File Manager
+# DataForge
 
-A simple way to automate working with files. You can set a time delay for automatic execution of your command. For example:
-
-python fileManager.py move ./Downloads/ ./Videos -p .mp4 .MP4 .mov .MOV -r -s 60
-This command will move all files with .mp4 .MP4 .mov .MOV from Downloads to the Videos directory, check the Downloads directory again and do task one more time until there is no files that match patterns in Downloads, then FileManager will be waiting for 60 seconds and check Downloads again.
+A simple way to automate working with datasets. You can set a time delay for automatic execution of your command.
 
 if you don’t want the command works in a cycle, just don't use "-r" argument. And it will be executed for one time.
 
 
 ## Available commands
 - **move** - move files from source directory to target directory
 - **slice** - slice video files to images from the source directory to the target directory. Also, you can set flag "--remove" or "-rm" for deleting a source video file after slicing
+
 - **delete** - delete files that match patterns from source directory
 - **dedup** - find duplicates in source directory that matches a pattern. An image means a duplicate if it's hash has lower
 Hamming distance with comparing image hash than threshold value. The threshold value setups in percentage and must be in range [0, 100]. Pay attention to core_size parameter: the lower value makes details at photo less important, and the higher value makes details mach important while comparing information at images. It’s implemented only dHash comparing method for now.
 - **clean-annotations** - find annotation files in directory that doesn't have corresponding files
+- **convert-annotations** - converts annotations from source format to destination format
+
+#### to see command syntax and arguments use:
+python data_forge.py <command> -h
+
 ## How to use:
 clone git repository:
 
@@ -36,11 +39,11 @@ read the --help command for learn more about available commands and arguments:
 
 for check available commands
 
-python fileManager.py --help
+python data_forge.py --help
 
 for check the command usage and available arguments
 
-python fileManager.py {command} --help
+python data_forge.py {command} --help
 
 
 ## What else?
@@ -51,5 +54,5 @@ For more comfortable using FileManager with multiple tasks you can create an .sh
 
 for stop executing of all commands use:
 
-pkill -f fileManager.py
+pkill -f data_forge.py
 
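The `dedup` entry in the README describes comparing dHash values by Hamming distance against a percentage threshold. A minimal sketch of that comparison follows, assuming the image has already been reduced to a small grayscale grid. The function names (`dhash_bits`, `hamming_percent`, `is_duplicate`) are hypothetical and are not taken from the project.

```python
from typing import List, Sequence


def dhash_bits(gray: Sequence[Sequence[int]]) -> List[int]:
    """Difference hash: one bit per pair of horizontally adjacent pixels."""
    bits = []
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits


def hamming_percent(a: Sequence[int], b: Sequence[int]) -> float:
    """Share of differing bits between two equal-length hashes, in percent."""
    if len(a) != len(b) or not a:
        raise ValueError("hashes must be non-empty and of equal length")
    differing = sum(x != y for x, y in zip(a, b))
    return 100.0 * differing / len(a)


def is_duplicate(a: Sequence[int], b: Sequence[int], threshold_percent: float) -> bool:
    """Assumed rule: duplicate when the distance is below the threshold."""
    return hamming_percent(a, b) < threshold_percent


# Two tiny 2x3 "images" that differ in one pixel comparison.
hash_a = dhash_bits([[10, 20, 30], [30, 20, 10]])
hash_b = dhash_bits([[10, 20, 30], [30, 20, 25]])
distance = hamming_percent(hash_a, hash_b)
```

In this sketch a larger grid produces more bits, so small local differences weigh less in the percentage, which is one way to read the README's remark about `core_size`.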

const_utils/annotation.py

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Tuple, Dict, Optional
+
+from logger.log_level_mapping import LevelMapping
+from logger.logger import LoggerConfigurator
+
+
+class ObjectAnnotation:
+    def __init__(self, log_level: str = LevelMapping.debug, log_path: Optional[Path] = None, **kwargs):
+        self.imsize: Tuple[int, int] = kwargs.get("imsize")
+        self.name: str = kwargs.get("name")
+        self.pose: str = kwargs.get("pose", 'Unspecified')
+        self.truncated: int = kwargs.get("truncated", 0)
+        self.difficult: int = kwargs.get("difficult", 0)
+        self.bndbox: Dict[str, int] = kwargs.get("bndbox", {})
+        self.width: int = None
+        self.height: int = None
+        self.x_center: int = None
+        self.y_center: int = None
+        self.area: int = None
+        self.aspect_ratio: int = None
+        self.relative_area: float = None
+
+        self.logger = LoggerConfigurator.setup(
+            name=self.__class__.__name__,
+            log_level=log_level,
+            log_path=Path(log_path) / f"{self.__class__.__name__}.log" if log_path else None
+        )
+
+    @property
+    def area(self) -> int:
+        return self._area
+
+    @area.setter
+    def area(self, value: int) -> None:
+        if isinstance(value, int):
+            self._area = value
+        else:
+            try:
+                self._area = int(float(value))
+            except TypeError as e:
+                error_text = f"Area must be an integer, got {value}"
+                self.logger.warning(error_text)
+                raise TypeError(e)
+
+    @property
+    def width(self) -> int:
+        return self._width
+
+
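Review note on the file above: as far as the 51 added lines show, `width` is a property without a setter, so the `self.width = None` assignment in `__init__` would raise `AttributeError` in Python. The `area` setter also catches only `TypeError`, while `int(float("abc"))` raises `ValueError`. The sketch below shows one possible shape for a validated property. `BoxStats` is a hypothetical stand-in, not the project's class.

```python
from typing import Optional


class BoxStats:
    """Hypothetical stand-in for the annotation class; not the project's code."""

    def __init__(self, area: Optional[float] = None) -> None:
        # Initialise the private backing field directly, so no setter
        # runs before the object is fully constructed.
        self._area: Optional[int] = None
        if area is not None:
            self.area = area

    @property
    def area(self) -> Optional[int]:
        return self._area

    @area.setter
    def area(self, value) -> None:
        try:
            self._area = int(float(value))
        except (TypeError, ValueError) as exc:
            # Cover both failure modes: None-like input and non-numeric strings.
            raise TypeError(f"Area must be numeric, got {value!r}") from exc


stats = BoxStats(12.7)
```

Writing to `self._area` in the constructor keeps the property read path working even when no value has been supplied yet.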

const_utils/arguments.py

Lines changed: 3 additions & 1 deletion
@@ -4,7 +4,8 @@
 class Arguments:
     """Command arguments"""
     src: str = "src"
-    dst: str = "dst"
+    dst: str = "--dst"
+
     pattern: str = "--pattern"
     p: str = "-p"
     repeat: str = "--repeat"
@@ -30,3 +31,4 @@ class Arguments:
     cache_name: str = "--cache_name"
     a_suffix: str = "--a_suffix"
     a_source: str = "--a_source"
+    destination_type: str = "--destination-type"

const_utils/commands.py

Lines changed: 2 additions & 1 deletion
@@ -7,4 +7,5 @@ class Commands:
     slice: str = "slice"
     delete: str = "delete"
     dedup: str = "dedup"
-    clean_annotations: str = "clean-annotations"
+    clean_annotations: str = "clean-annotations"
+    convert_annotations: str = "convert-annotations"

const_utils/default_values.py

Lines changed: 12 additions & 1 deletion
@@ -40,6 +40,7 @@ class AppSettings(BaseSettings):
     cache_name: Optional[Path] = Field(default=None)
     a_suffix: Tuple[str, ...] = Field(default_factory=tuple)
     a_source: Optional[Path] = Field(default=None)
+    destination_type: Optional[str] = Field(default=None)
 
     @field_validator('core_size')
     @classmethod
@@ -55,7 +56,17 @@ def ensure_path(cls, value: Union[str, Path]) -> Path:
             return Path(value)
         return value
 
-
+    @field_validator("n_jobs")
+    @classmethod
+    def ensure_n_jobs(cls, value: Union[int, str]) -> int:
+        if not isinstance(value, int):
+            return int(float(value))
+        elif value >= multiprocessing.cpu_count():
+            return multiprocessing.cpu_count() - 1
+        elif value < 1:
+            return 1
+        else:
+            return value
 
     @classmethod
     def load_config(cls, config_path: Path = Constants.config_file) -> "AppSettings":
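Review note on `ensure_n_jobs`: the first branch returns as soon as a non-integer value has been converted, so an input such as `"64"` skips both bounds, and `cpu_count() - 1` evaluates to 0 on a single-core host. A minimal sketch of a clamp that converts first and bounds afterwards is shown below. `clamp_n_jobs` and its `cpu_count` parameter are hypothetical, added here only so the logic can be exercised in isolation.

```python
import multiprocessing
from typing import Optional, Union


def clamp_n_jobs(value: Union[int, float, str], cpu_count: Optional[int] = None) -> int:
    """Convert to int, then keep the result within [1, max(1, cpus - 1)]."""
    cpus = cpu_count if cpu_count is not None else multiprocessing.cpu_count()
    upper = max(1, cpus - 1)  # never drop below one worker
    jobs = int(float(value))  # same conversion the validator uses
    return max(1, min(jobs, upper))


# Example calls with an explicit CPU count of 8.
from_string = clamp_n_jobs("64", cpu_count=8)
from_zero = clamp_n_jobs(0, cpu_count=8)
```

Inside the settings class the same body could sit in the validator, with `multiprocessing.cpu_count()` called directly.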

const_utils/parser_help.py

Lines changed: 2 additions & 1 deletion
@@ -32,4 +32,5 @@ class HelpStrings:
         "with next signature: <cache_{path_hash}_d{folder_name}{hash_type}s{core_size}.pkl>")
     a_suffix: str = "A suffix pattern for annotations"
     a_source: str = ("A source directory to annotations. If None - that means annotations are in the same folder with"
-        " images")
+        " images")
+    destination_type: str = "A type of destination annotation format"

fileManager.py renamed to data_forge.py

Lines changed: 5 additions & 4 deletions
@@ -5,15 +5,15 @@
 from const_utils.parser_help import HelpStrings as hs
 from const_utils.commands import Commands
 from const_utils.arguments import Arguments as arg
-# from const_utils.default_values import DefaultValues as defaults
+from file_operations.convert_annotations import ConvertAnnotationsOperation
 from file_operations.deduplicate import DedupOperation
 from file_operations.delete import DeleteOperation
 from file_operations.move import MoveOperation
 from file_operations.slice import SliceOperation
 from file_operations.clean_annotations import CleanAnnotationsOperation
 
 
-class FileManager:
+class DataForge:
     """Class corresponding to CLI and launch command"""
     def __init__(self):
         self.parser = argparse.ArgumentParser(description="FileManager")
@@ -23,7 +23,8 @@ def __init__(self):
             Commands.slice: SliceOperation,
             Commands.delete: DeleteOperation,
             Commands.dedup: DedupOperation,
-            Commands.clean_annotations: CleanAnnotationsOperation
+            Commands.clean_annotations: CleanAnnotationsOperation,
+            Commands.convert_annotations: ConvertAnnotationsOperation
         }
         self.settings = AppSettings.load_config(Constants.config_file)
         self._setup_commands()
@@ -65,5 +66,5 @@ def execute(self):
 
 if __name__ == "__main__":
 
-    app = FileManager()
+    app = DataForge()
     app.execute()
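This commit turns `dst` into the optional flag `--dst` and adds `--destination-type` for the new `convert-annotations` command. The sketch below shows how such arguments could be wired with `argparse` subparsers. The parser layout, defaults, and the `build_parser` helper are assumptions for illustration; the hidden `_setup_commands` method may do this differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with a single subcommand; not the project's code."""
    parser = argparse.ArgumentParser(prog="data_forge.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    convert = subparsers.add_parser("convert-annotations")
    convert.add_argument("src")  # positional source directory
    convert.add_argument("--dst", default=None)  # optional after this commit
    convert.add_argument(
        "--destination-type",
        dest="destination_type",
        default=None,
        help="A type of destination annotation format",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for the example.
args = build_parser().parse_args(
    ["convert-annotations", "./voc", "--dst", "./yolo", "--destination-type", "yolo"]
)
```

Because `--dst` is now a flag rather than a positional argument, existing invocations that pass the destination positionally would need to be updated.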

docs/api/base_hasher.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+::: tools.comparer.img_comparer.hasher.base_hasher.BaseHasher

docs/api/converter.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+::: tools.annotation_converter.converter.base.BaseConverter
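The converter sources themselves are not visible in this part of the diff. For orientation, the standard arithmetic behind a Pascal VOC to YOLO conversion can be sketched as follows. The function names are hypothetical, and the `(width, height)` order of `imsize` is an assumption; the `bndbox` keys follow the VOC XML field names.

```python
from typing import Dict, Tuple


def voc_to_yolo(bndbox: Dict[str, int], imsize: Tuple[int, int]) -> Tuple[float, float, float, float]:
    """Convert absolute VOC corners to normalised YOLO centre/size values."""
    img_w, img_h = imsize  # assumed order: (width, height)
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image size must be positive")
    x_center = (bndbox["xmin"] + bndbox["xmax"]) / 2 / img_w
    y_center = (bndbox["ymin"] + bndbox["ymax"]) / 2 / img_h
    width = (bndbox["xmax"] - bndbox["xmin"]) / img_w
    height = (bndbox["ymax"] - bndbox["ymin"]) / img_h
    return x_center, y_center, width, height


def yolo_line(class_id: int, box: Tuple[float, float, float, float]) -> str:
    """Format one object as a YOLO label line: class x_center y_center w h."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


box = voc_to_yolo({"xmin": 100, "ymin": 50, "xmax": 300, "ymax": 250}, (400, 500))
line = yolo_line(0, box)
```

Each object is independent of the others, which is what makes the per-file conversion easy to spread across worker processes.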
