Decoding whole tracks into memory seems pretty wasteful (and then getting entire waveforms on top of that, though this part can probably be limited). I guess it's okay if you're looping a thirty-second track, but not so much if it's twenty minutes, as mentioned in another comment. Makes me wonder if compressed audio can be used directly in such cases.
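
To put rough numbers on that, here's a back-of-the-envelope sketch. The format is my assumption, not something stated in the thread: 44.1 kHz, stereo, 16-bit PCM. The `pcm_bytes` helper is made up for illustration. If the decoder produces 32-bit float samples instead, double the figures.

```python
def pcm_bytes(seconds, sample_rate=44_100, channels=2, bytes_per_sample=2):
    """Size of fully decoded PCM audio in bytes (assumed format, see above)."""
    return seconds * sample_rate * channels * bytes_per_sample


MIB = 1024 * 1024

# Compare the two track lengths mentioned above.
for label, seconds in [("30 s loop", 30), ("20 min track", 20 * 60)]:
    size = pcm_bytes(seconds)
    print(f"{label}: {size} bytes ({size / MIB:.1f} MiB)")
```

Under those assumptions the thirty-second loop is about 5 MiB decoded, while the twenty-minute track is roughly 200 MiB, before any waveform data is added on top.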