
1. Downloading 25000 images

shell
cerry@ASUS-C:~/p/l/yolo►python ./prepare_yolov8_dataset_3.py                                       (yolov5) 7.107s 00:16
--- Starting Data Preparation for 42 Food Classes ---
Target classes: ['Apple', 'Orange', 'Strawberry', 'Grape', 'Banana', 'Pear', 'Peach', 'Lemon', 'Watermelon', 'Cantaloupe', 'Pomegranate', 'Tomato', 'Cucumber', 'Carrot', 'Broccoli', 'Bell pepper', 'Mushroom', 'Potato', 'Cabbage', 'Pumpkin', 'Radish', 'Garden Asparagus', 'Zucchini', 'Winter melon', 'Chicken', 'Shrimp', 'Crab', 'Fish', 'Egg (Food)', 'Seafood', 'Milk', 'Cheese', 'Dairy Product', 'Bread', 'Baked goods', 'Pasta', 'Cookie', 'Pastry', 'Coffee', 'Tea', 'Juice', 'Drink']
Samples per class: 150-250
Initial download pool size: 20000
--------------------------------------------------
Step 1/3: Loading (or downloading) an initial dataset pool.
Dataset 'open-images-v7-selected-42-food-classes' already exists, loading it from local FiftyOne database.
Initial dataset pool loaded with 11243 samples.
--------------------------------------------------

--- DEBUG INFO: Dataset Schema ---
Dataset fields:
  id: fiftyone.core.fields.ObjectIdField
  filepath: fiftyone.core.fields.StringField
  tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
  metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
  created_at: fiftyone.core.fields.DateTimeField
  last_modified_at: fiftyone.core.fields.DateTimeField
  ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)

Dataset first sample details (if available):
<Sample: {
    'id': '68bc5b84edc3e8838faf3234',
    'media_type': 'image',
    'filepath': '/home/cerry/fiftyone/open-images-v7/train/data/015af843d0b60389.jpg',
    'tags': ['train'],
    'metadata': None,
    'created_at': datetime.datetime(2025, 9, 6, 16, 4, 20, 301000),
    'last_modified_at': datetime.datetime(2025, 9, 6, 16, 4, 20, 301000),
    'ground_truth': <Detections: {
        'detections': [
            <Detection: {
                'id': '68bc5b84edc3e8838faf3233',
                'attributes': {},
                'tags': [],
                'label': 'Shrimp',
                'bounding_box': [0.1025, 0.32389, 0.8687499999999999, 0.46270100000000003],
                'mask': None,
                'mask_path': None,
                'confidence': None,
                'index': None,
                'IsOccluded': False,
                'IsTruncated': False,
                'IsGroupOf': False,
                'IsDepiction': False,
                'IsInside': False,
            }>,
        ],
    }>,
}>

'ground_truth' field type on first sample: <class 'fiftyone.core.labels.Detections'>
Number of detections on first sample: 1
First detection label: Shrimp
----------------------------------

Step 2/3: Filtering samples per class to meet target counts.
  Processing class: 'Apple'...
  Processing class: 'Orange'...
  Processing class: 'Strawberry'...
  Processing class: 'Grape'...
  Processing class: 'Banana'...
  Processing class: 'Pear'...
    WARNING: Not enough unique samples for class 'Pear'. Found 44, need 150.
  Processing class: 'Peach'...
    WARNING: Not enough unique samples for class 'Peach'. Found 36, need 150.
  Processing class: 'Lemon'...
    WARNING: Not enough unique samples for class 'Lemon'. Found 106, need 150.
  Processing class: 'Watermelon'...
    WARNING: Not enough unique samples for class 'Watermelon'. Found 98, need 150.
  Processing class: 'Cantaloupe'...
    WARNING: Not enough unique samples for class 'Cantaloupe'. Found 11, need 150.
  Processing class: 'Pomegranate'...
    WARNING: Not enough unique samples for class 'Pomegranate'. Found 46, need 150.
  Processing class: 'Tomato'...
  Processing class: 'Cucumber'...
    WARNING: Not enough unique samples for class 'Cucumber'. Found 94, need 150.
  Processing class: 'Carrot'...
    WARNING: Not enough unique samples for class 'Carrot'. Found 138, need 150.
  Processing class: 'Broccoli'...
    WARNING: Not enough unique samples for class 'Broccoli'. Found 94, need 150.
  Processing class: 'Bell pepper'...
    WARNING: Not enough unique samples for class 'Bell pepper'. Found 79, need 150.
  Processing class: 'Mushroom'...
  Processing class: 'Potato'...
    WARNING: Not enough unique samples for class 'Potato'. Found 88, need 150.
  Processing class: 'Cabbage'...
    WARNING: Not enough unique samples for class 'Cabbage'. Found 81, need 150.
  Processing class: 'Pumpkin'...
  Processing class: 'Radish'...
    WARNING: Not enough unique samples for class 'Radish'. Found 48, need 150.
  Processing class: 'Garden Asparagus'...
    WARNING: Not enough unique samples for class 'Garden Asparagus'. Found 74, need 150.
  Processing class: 'Zucchini'...
    WARNING: Not enough unique samples for class 'Zucchini'. Found 15, need 150.
  Processing class: 'Winter melon'...
    WARNING: Not enough unique samples for class 'Winter melon'. Found 14, need 150.
  Processing class: 'Chicken'...
  Processing class: 'Shrimp'...
    WARNING: Not enough unique samples for class 'Shrimp'. Found 121, need 150.
  Processing class: 'Crab'...
  Processing class: 'Fish'...
  Processing class: 'Egg (Food)'...
  Processing class: 'Seafood'...
  Processing class: 'Milk'...
    WARNING: Not enough unique samples for class 'Milk'. Found 35, need 150.
  Processing class: 'Cheese'...
    WARNING: Not enough unique samples for class 'Cheese'. Found 139, need 150.
  Processing class: 'Dairy Product'...
  Processing class: 'Bread'...
  Processing class: 'Baked goods'...
  Processing class: 'Pasta'...
  Processing class: 'Cookie'...
  Processing class: 'Pastry'...
  Processing class: 'Coffee'...
  Processing class: 'Tea'...
    WARNING: Not enough unique samples for class 'Tea'. Found 140, need 150.
  Processing class: 'Juice'...
  Processing class: 'Drink'...

Finished filtering. Total unique samples selected for final dataset: 6669.
 100% |█████████████████████████████████████████████| 6669/6669 [15.5s elapsed, 0s remaining, 545.5 samples/s]
Final filtered dataset 'open-images-v7-selected-42-food-classes-filtered' created with 6669 samples.
--------------------------------------------------
Launching FiftyOne App to visualize the filtered dataset (optional, close browser tab to continue script)...
App launched. Point your web browser to http://localhost:5151
Step 3/3: Exporting dataset to YOLOv8 format in 'yolov8_42_food_classes_data'...
Directory 'yolov8_42_food_classes_data' already exists; export will be merged with existing files
 100% |█████████████████████████████████████████████| 6669/6669 [25.1s elapsed, 0s remaining, 323.9 samples/s]
Export complete! Data for YOLOv8 training is in 'yolov8_42_food_classes_data'.
You will find 'images/train/', 'labels/train/', 'classes.txt', and 'dataset.yaml' inside 'yolov8_42_food_classes_data'.
--------------------------------------------------
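
The filtering in Step 2/3 is not shown in the log, so the following is only a minimal sketch of how a per-class selection with a 150-250 target might work. The function name `select_per_class`, its parameters, and the toy pool are all hypothetical. It also assumes that classes below the minimum still contribute whatever samples they have, which the log does not confirm.

```python
import random


def select_per_class(samples_by_class, min_per_class=150, max_per_class=250, seed=0):
    """Pick up to max_per_class sample IDs per class and return their union.

    Hypothetical helper: the real script's selection logic is not shown.
    """
    rng = random.Random(seed)
    selected = set()   # a set, so an image shared by two classes is counted once
    warnings = []
    for label, sample_ids in samples_by_class.items():
        unique_ids = sorted(set(sample_ids))  # drop duplicate IDs within a class
        if len(unique_ids) < min_per_class:
            warnings.append(
                f"WARNING: Not enough unique samples for class '{label}'. "
                f"Found {len(unique_ids)}, need {min_per_class}."
            )
        # Assumption: short classes still contribute everything they have.
        count = min(len(unique_ids), max_per_class)
        selected.update(rng.sample(unique_ids, count))
    return selected, warnings


# Toy pool with made-up IDs: 'Pear' is short, and its images also contain 'Apple'.
pool = {
    "Apple": [f"img_{i}" for i in range(300)],
    "Pear": [f"img_{i}" for i in range(44)],
}
selected, warnings = select_per_class(pool)
for message in warnings:
    print(message)
print(f"Total unique samples selected: {len(selected)}")
```

Because the selection is a set union, the final total can be well below the sum of the per-class counts when images contain several target classes, which would be consistent with a total such as 6669 across 42 classes.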

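The exported `labels/train/` files use YOLO's `class cx cy w h` layout, while FiftyOne stores `bounding_box` as `[top-left-x, top-left-y, width, height]`. Both are normalized to the image size. The sketch below shows that conversion on the Shrimp detection printed in the debug output. The helper names are hypothetical, and the class index 25 is only Shrimp's position in the "Target classes" list above. The index the exporter actually wrote should be checked against `classes.txt` or `dataset.yaml`.

```python
def fiftyone_box_to_yolo(bounding_box):
    """Convert [x_top_left, y_top_left, w, h] to (x_center, y_center, w, h).

    All values stay normalized to [0, 1]; only the anchor point moves.
    """
    x, y, w, h = bounding_box
    return (x + w / 2.0, y + h / 2.0, w, h)


def yolo_label_line(class_index, bounding_box):
    """Format one detection as a YOLO label line (hypothetical helper)."""
    cx, cy, w, h = fiftyone_box_to_yolo(bounding_box)
    return f"{class_index} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# Bounding box copied from the first sample in the debug output above.
shrimp_box = [0.1025, 0.32389, 0.8687499999999999, 0.46270100000000003]

# Assumption: 25 is the index of 'Shrimp' in the target class list.
line = yolo_label_line(25, shrimp_box)
print(line)
```

The export step already performs this conversion, so the sketch is mainly useful for spot-checking a label file against the detection shown in the FiftyOne App.
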
2. View