Sorry I’m still confused. You took roughly 7000 pictures in two afternoons? What do you mean by sliced them in bulk? If you took them from different angles how do you slice them in bulk?
By "slicing in bulk" I mean the server split each uploaded picture into 81 smaller images, rather than the app doing the slicing and uploading 81 small images.
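For illustration, here is a minimal sketch of what that server-side slicing could look like. It assumes the 81 images form a 9x9 grid of equal tiles and that the picture arrives already perspective-corrected as a list of pixel rows. The name `slice_into_tiles` is hypothetical and this is not the actual server code.

```python
def slice_into_tiles(image, rows=9, cols=9):
    """Split a 2D image (a list of pixel rows) into rows*cols tiles.

    Tiles are returned in row-major order. Assumes (hypothetically) that the
    image height and width are exact multiples of `rows` and `cols`.
    """
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // rows, width // cols
    tiles = []
    for r in range(rows):
        for c in range(cols):
            # Take the band of pixel rows for grid row r, then cut out column c.
            band = image[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in band])
    return tiles


# Usage: a toy 18x18 "image" whose pixel value encodes its position.
image = [[y * 18 + x for x in range(18)] for y in range(18)]
tiles = slice_into_tiles(image)
print(len(tiles))  # one upload becomes 81 small images
```

Doing this on the server means the app only uploads one file per picture, and the grid size can be changed later without shipping a new app version.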
I took them from different angles because the perspective correction adds distortions that I didn't want my model to be sensitive to.
7000 pictures at 5 seconds per picture is "only" about 10 hours of work (35,000 seconds, or roughly 9.7 hours). The per-picture time could possibly be lower than that too. That seems quite doable over 2-4 afternoons.
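A quick back-of-the-envelope check of that estimate, as a sketch. The 5 seconds per picture is the guess from the comment above, not a measured value.

```python
pictures = 7000
seconds_per_picture = 5  # assumed average, taken from the estimate above

total_hours = pictures * seconds_per_picture / 3600
print(f"total: {total_hours:.2f} hours")

# Hours of shooting needed per afternoon if spread over 2, 3, or 4 afternoons.
for afternoons in (2, 3, 4):
    print(f"{afternoons} afternoons: {total_hours / afternoons:.2f} hours each")
```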
Props for doing the project end to end, including the non-trivial (and typically skipped) part of collecting training data.