Uses DINO to detect regions, then refines them with SAM. Returns the mask data as RLE-encoded JSON.
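
The exact JSON schema of the returned masks is not specified here, so the following is only a minimal sketch of what run-length encoding (RLE) a binary mask into JSON can look like. It assumes row-major flattening and counts that alternate between runs of 0s and 1s, starting with 0s. The function names `rle_encode` and `rle_decode` and the keys `size` and `counts` are hypothetical and may not match the actual output.

```python
import json


def rle_encode(mask):
    """Run-length encode a 2D binary mask given as a list of rows of 0/1.

    Assumed layout (not confirmed by the tool): the mask is flattened
    row by row, and counts alternate between runs of 0s and runs of 1s,
    starting with 0s. A mask that begins with 1 gets a leading count of 0.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    counts = []
    current = 0  # value of the run currently being counted
    run = 0
    for row in mask:
        for value in row:
            value = 1 if value else 0
            if value == current:
                run += 1
            else:
                counts.append(run)
                current = value
                run = 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


def rle_decode(rle):
    """Rebuild the 2D mask from the dictionary produced by rle_encode."""
    height, width = rle["size"]
    flat = []
    value = 0
    for count in rle["counts"]:
        flat.extend([value] * count)
        value = 1 - value
    return [flat[r * width:(r + 1) * width] for r in range(height)]


# Round trip through JSON, as a consumer of the output might do.
mask = [[0, 0, 1],
        [1, 1, 0]]
payload = json.dumps(rle_encode(mask))
restored = rle_decode(json.loads(payload))
```

If the real output follows a different convention (for example COCO-style column-major RLE), the decoder would need to be adjusted to match.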