lucataco/stable-diffusion-3.5-large-lora-trainer
Fine-tune Stable Diffusion 3.5 Large with Hugging Face Diffusers
Prediction
- Model version: lucataco/stable-diffusion-3.5-large-lora-trainer:cd6419a53b69fd410a912d945fa481a2a9ecfc4ab93062ed76c53f6e617f89e9
- ID: ng14j2cff1rj40cjrr2vbz667m
- Status: Succeeded
- Source: Web
- Hardware: A100 (80GB)
- Created by: @lucataco
Input
- rank: 16
- backend: no
- optimizer: AdamW
- resolution: 768
- input_images: yarn.zip
- lr_scheduler: constant
- learning_rate: 0.0001
- instance_prompt: Frog, yarn art style
- max_train_steps: 700
- train_batch_size: 1
- gradient_accumulation_steps: 1
{ "rank": 16, "backend": "no", "optimizer": "AdamW", "resolution": 768, "input_images": "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip", "lr_scheduler": "constant", "learning_rate": 0.0001, "instance_prompt": "Frog, yarn art style", "max_train_steps": 700, "train_batch_size": 1, "gradient_accumulation_steps": 1 }
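As a sanity check on these inputs, the training log for this run reports 16 examples, 16 batches per epoch, 44 epochs, a total train batch size of 1, and 700 optimization steps. The sketch below reproduces those numbers from the input values. It assumes a single process and the usual ceiling-division arithmetic used by Diffusers-style training scripts; the function name `training_schedule` is made up for illustration and is not part of the trainer.

```python
import math


def training_schedule(num_examples, train_batch_size,
                      gradient_accumulation_steps, max_train_steps):
    """Derive epoch and batch counts from the trainer inputs.

    Assumes one process (no data parallelism), which matches the
    "Num processes: 1" line in this run's log.
    """
    # Batches the dataloader yields per pass over the data.
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    # Optimizer updates per epoch after gradient accumulation.
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    # Epochs needed to reach the requested number of optimizer steps.
    num_epochs = math.ceil(max_train_steps / updates_per_epoch)
    # Samples contributing to each optimizer update.
    total_batch_size = train_batch_size * gradient_accumulation_steps
    return {
        "batches_per_epoch": batches_per_epoch,
        "updates_per_epoch": updates_per_epoch,
        "num_epochs": num_epochs,
        "total_batch_size": total_batch_size,
    }


# Values from this prediction: 16 images in yarn.zip, batch size 1,
# no gradient accumulation, 700 steps.
schedule = training_schedule(16, 1, 1, 700)
print(schedule)
```

With 16 images and 700 steps, each image is seen roughly 44 times, which is worth keeping in mind when scaling `max_train_steps` for a larger or smaller dataset.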
Install Replicate’s Node.js client library:
npm install replicate
Import and set up the client:
import Replicate from "replicate";
const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
import { writeFile } from "node:fs/promises";

const output = await replicate.run(
  "lucataco/stable-diffusion-3.5-large-lora-trainer:cd6419a53b69fd410a912d945fa481a2a9ecfc4ab93062ed76c53f6e617f89e9",
  {
    input: {
      rank: 16,
      backend: "no",
      optimizer: "AdamW",
      resolution: 768,
      input_images: "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip",
      lr_scheduler: "constant",
      learning_rate: 0.0001,
      instance_prompt: "Frog, yarn art style",
      max_train_steps: 700,
      train_batch_size: 1,
      gradient_accumulation_steps: 1
    }
  }
);

// To access the file URL:
console.log(output.url()); //=> "http://example.com"

// To write the file to disk ("my-image.png" is a placeholder file name):
await writeFile("my-image.png", output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:
pip install replicate
Import the client:
import replicate
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "lucataco/stable-diffusion-3.5-large-lora-trainer:cd6419a53b69fd410a912d945fa481a2a9ecfc4ab93062ed76c53f6e617f89e9",
    input={
        "rank": 16,
        "backend": "no",
        "optimizer": "AdamW",
        "resolution": 768,
        "input_images": "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip",
        "lr_scheduler": "constant",
        "learning_rate": 0.0001,
        "instance_prompt": "Frog, yarn art style",
        "max_train_steps": 700,
        "train_batch_size": 1,
        "gradient_accumulation_steps": 1
    }
)
print(output)
To learn more, take a look at the guide on getting started with Python.
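When launching several training runs, it can help to build the input dictionary in one place and check the numeric fields before sending anything to the API. The following is a minimal sketch under stated assumptions: `build_training_input` and `start_training` are hypothetical helpers, the default values are simply the ones used in this prediction (not the model's schema defaults), and the validation rules are illustrative guesses rather than constraints documented by the trainer. Only `replicate.run` is taken from the snippet above.

```python
import os

MODEL_VERSION = (
    "lucataco/stable-diffusion-3.5-large-lora-trainer:"
    "cd6419a53b69fd410a912d945fa481a2a9ecfc4ab93062ed76c53f6e617f89e9"
)


def build_training_input(input_images, instance_prompt, *, rank=16,
                         resolution=768, learning_rate=0.0001,
                         max_train_steps=700, train_batch_size=1,
                         gradient_accumulation_steps=1):
    """Assemble the trainer input, using this run's values as defaults."""
    # Illustrative checks only; the real schema may allow other values.
    for name, value in [("rank", rank), ("resolution", resolution),
                        ("max_train_steps", max_train_steps),
                        ("train_batch_size", train_batch_size),
                        ("gradient_accumulation_steps",
                         gradient_accumulation_steps)]:
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    if learning_rate <= 0:
        raise ValueError("learning_rate must be positive")
    if not instance_prompt.strip():
        raise ValueError("instance_prompt must not be empty")
    return {
        "rank": rank,
        "backend": "no",
        "optimizer": "AdamW",
        "resolution": resolution,
        "input_images": input_images,
        "lr_scheduler": "constant",
        "learning_rate": learning_rate,
        "instance_prompt": instance_prompt,
        "max_train_steps": max_train_steps,
        "train_batch_size": train_batch_size,
        "gradient_accumulation_steps": gradient_accumulation_steps,
    }


def start_training(training_input):
    """Run the trainer; needs the replicate package and an API token."""
    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("REPLICATE_API_TOKEN is not set")
    import replicate  # imported lazily so the builder works without it
    return replicate.run(MODEL_VERSION, input=training_input)


training_input = build_training_input(
    "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip",
    "Frog, yarn art style",
)
print(training_input)
# To actually launch the run: output = start_training(training_input)
```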
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "cd6419a53b69fd410a912d945fa481a2a9ecfc4ab93062ed76c53f6e617f89e9",
    "input": {
      "rank": 16,
      "backend": "no",
      "optimizer": "AdamW",
      "resolution": 768,
      "input_images": "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip",
      "lr_scheduler": "constant",
      "learning_rate": 0.0001,
      "instance_prompt": "Frog, yarn art style",
      "max_train_steps": 700,
      "train_batch_size": 1,
      "gradient_accumulation_steps": 1
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
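The prediction object returned by the API carries `created_at`, `completed_at`, and a `logs` string containing the training progress lines. The sketch below shows one way to pull the wall-clock time and the per-step loss out of such an object. The helper names are hypothetical, and the regular expression assumes the progress-bar layout seen in this run's log (`step/total [..., loss=..., lr=...]`), which may differ between versions of the trainer. The elapsed time computed here is simply `completed_at` minus `created_at`, which is not necessarily the figure Replicate reports or bills.

```python
import re
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp with a trailing "Z" (UTC)."""
    # datetime.fromisoformat() only accepts "Z" from Python 3.11 onward,
    # so rewrite it as an explicit UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def wall_clock_seconds(prediction):
    """Seconds between creation and completion of the prediction."""
    created = parse_timestamp(prediction["created_at"])
    completed = parse_timestamp(prediction["completed_at"])
    return (completed - created).total_seconds()


# Matches "<step>/<total> [ ... loss=<value>" within a single bracket pair.
LOSS_PATTERN = re.compile(r"(\d+)/(\d+) \[[^\]]*loss=([0-9.]+)")


def losses_by_step(logs):
    """Map each step number to the last loss value logged for it.

    Each step appears more than once in the log (first with the previous
    step's loss, then with its own), so later matches overwrite earlier ones.
    """
    losses = {}
    for step, _total, loss in LOSS_PATTERN.findall(logs):
        losses[int(step)] = float(loss)
    return losses


# A small excerpt in the shape of this prediction's response.
prediction = {
    "created_at": "2024-10-25T23:31:46.168000Z",
    "completed_at": "2024-10-25T23:39:14.239794Z",
    "logs": (
        "Steps: 0%| | 0/700 [00:00<?, ?it/s]\n"
        "Steps: 0%| | 1/700 [00:00<07:26, 1.56it/s]\n"
        "Steps: 0%| | 1/700 [00:00<07:26, 1.56it/s, loss=0.132, lr=0.0001]\n"
        "Steps: 0%| | 2/700 [00:01<05:50, 1.99it/s, loss=0.132, lr=0.0001]\n"
        "Steps: 0%| | 2/700 [00:01<05:50, 1.99it/s, loss=0.189, lr=0.0001]\n"
        "Steps: 0%| | 3/700 [00:01<05:27, 2.13it/s, loss=0.189, lr=0.0001]\n"
        "Steps: 0%| | 3/700 [00:01<05:27, 2.13it/s, loss=0.0392, lr=0.0001]\n"
    ),
}

elapsed = wall_clock_seconds(prediction)
step_losses = losses_by_step(prediction["logs"])
print(f"elapsed: {elapsed:.1f} s")
print(step_losses)
```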
Output
{ "completed_at": "2024-10-25T23:39:14.239794Z", "created_at": "2024-10-25T23:31:46.168000Z", "data_removed": false, "error": null, "id": "ng14j2cff1rj40cjrr2vbz667m", "input": { "rank": 16, "backend": "no", "optimizer": "AdamW", "resolution": 768, "input_images": "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip", "lr_scheduler": "constant", "learning_rate": 0.0001, "instance_prompt": "Frog, yarn art style", "max_train_steps": 700, "train_batch_size": 1, "gradient_accumulation_steps": 1 }, "logs": "Using seed: 3595070789\nExtracted 16 files from zip to input_images\nUsing params: ['accelerate', 'launch', '--dynamo_backend', 'no', 'train_dreambooth_lora_sd3.py', '--pretrained_model_name_or_path', 'stable-diffusion-3.5-large', '--instance_data_dir', 'input_images', '--rank', '16', '--output_dir', '/tmp/train/output/sd35_large_train_replicate', '--mixed_precision', 'bf16', '--instance_prompt', 'Frog, yarn art style', '--resolution', '768', '--train_batch_size', '1', '--gradient_accumulation_steps', '1', '--optimizer', 'AdamW', '--learning_rate', '0.0001', '--lr_scheduler', 'constant', '--lr_warmup_steps', '0', '--max_train_steps', '700', '--checkpointing_steps', '701', '--seed', '3595070789', '--logging_dir', '/tmp/logs']\n10/25/2024 23:33:02 - INFO - __main__ - Distributed environment: DistributedType.NO\nNum processes: 1\nProcess index: 0\nLocal process index: 0\nDevice: cuda\nMixed precision type: bf16\nYou set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers\nYou are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.\nYou are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.\nYou are using a model of type t5 to instantiate a model of type . 
This is not supported for all configurations of models and can yield errors.\n{'base_image_seq_len', 'base_shift', 'max_shift', 'max_image_seq_len', 'use_dynamic_shifting'} was not found in config. Values will be initialized to default values.\nLoading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]\nLoading checkpoint shards: 50%|█████ | 1/2 [00:03<00:03, 3.67s/it]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.64s/it]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.64s/it]\n{'dual_attention_layers'} was not found in config. Values will be initialized to default values.\n10/25/2024 23:33:53 - INFO - __main__ - ***** Running training *****\n10/25/2024 23:33:53 - INFO - __main__ - Num examples = 16\n10/25/2024 23:33:53 - INFO - __main__ - Num batches each epoch = 16\n10/25/2024 23:33:53 - INFO - __main__ - Num Epochs = 44\n10/25/2024 23:33:53 - INFO - __main__ - Instantaneous batch size per device = 1\n10/25/2024 23:33:53 - INFO - __main__ - Total train batch size (w. 
parallel, distributed & accumulation) = 1\n10/25/2024 23:33:53 - INFO - __main__ - Gradient Accumulation steps = 1\n10/25/2024 23:33:53 - INFO - __main__ - Total optimization steps = 700\nSteps: 0%| | 0/700 [00:00<?, ?it/s]\nSteps: 0%| | 1/700 [00:00<07:26, 1.56it/s]\nSteps: 0%| | 1/700 [00:00<07:26, 1.56it/s, loss=0.132, lr=0.0001]\nSteps: 0%| | 2/700 [00:01<05:50, 1.99it/s, loss=0.132, lr=0.0001]\nSteps: 0%| | 2/700 [00:01<05:50, 1.99it/s, loss=0.189, lr=0.0001]\nSteps: 0%| | 3/700 [00:01<05:27, 2.13it/s, loss=0.189, lr=0.0001]\nSteps: 0%| | 3/700 [00:01<05:27, 2.13it/s, loss=0.0392, lr=0.0001]\nSteps: 1%| | 4/700 [00:01<05:17, 2.20it/s, loss=0.0392, lr=0.0001]\nSteps: 1%| | 4/700 [00:01<05:17, 2.20it/s, loss=0.203, lr=0.0001] \nSteps: 1%| | 5/700 [00:02<05:10, 2.24it/s, loss=0.203, lr=0.0001]\nSteps: 1%| | 5/700 [00:02<05:10, 2.24it/s, loss=0.165, lr=0.0001]\nSteps: 1%| | 6/700 [00:02<05:07, 2.26it/s, loss=0.165, lr=0.0001]\nSteps: 1%| | 6/700 [00:02<05:07, 2.26it/s, loss=0.175, lr=0.0001]\nSteps: 1%| | 7/700 [00:03<05:04, 2.27it/s, loss=0.175, lr=0.0001]\nSteps: 1%| | 7/700 [00:03<05:04, 2.27it/s, loss=0.171, lr=0.0001]\nSteps: 1%| | 8/700 [00:03<05:02, 2.28it/s, loss=0.171, lr=0.0001]\nSteps: 1%| | 8/700 [00:03<05:02, 2.28it/s, loss=0.141, lr=0.0001]\nSteps: 1%|▏ | 9/700 [00:04<05:01, 2.29it/s, loss=0.141, lr=0.0001]\nSteps: 1%|▏ | 9/700 [00:04<05:01, 2.29it/s, loss=0.203, lr=0.0001]\nSteps: 1%|▏ | 10/700 [00:04<05:00, 2.30it/s, loss=0.203, lr=0.0001]\nSteps: 1%|▏ | 10/700 [00:04<05:00, 2.30it/s, loss=0.0762, lr=0.0001]\nSteps: 2%|▏ | 11/700 [00:04<04:59, 2.30it/s, loss=0.0762, lr=0.0001]\nSteps: 2%|▏ | 11/700 [00:04<04:59, 2.30it/s, loss=0.0826, lr=0.0001]\nSteps: 2%|▏ | 12/700 [00:05<04:59, 2.30it/s, loss=0.0826, lr=0.0001]\nSteps: 2%|▏ | 12/700 [00:05<04:59, 2.30it/s, loss=0.19, lr=0.0001] \nSteps: 2%|▏ | 13/700 [00:05<04:59, 2.30it/s, loss=0.19, lr=0.0001]\nSteps: 2%|▏ | 13/700 [00:05<04:59, 2.30it/s, loss=0.285, lr=0.0001]\nSteps: 2%|▏ | 14/700 
[00:06<04:58, 2.30it/s, loss=0.285, lr=0.0001]\nSteps: 2%|▏ | 14/700 [00:06<04:58, 2.30it/s, loss=0.144, lr=0.0001]\nSteps: 2%|▏ | 15/700 [00:06<04:57, 2.30it/s, loss=0.144, lr=0.0001]\nSteps: 2%|▏ | 15/700 [00:06<04:57, 2.30it/s, loss=0.134, lr=0.0001]\nSteps: 2%|▏ | 16/700 [00:07<04:56, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 2%|▏ | 16/700 [00:07<04:56, 2.31it/s, loss=0.189, lr=0.0001]\nSteps: 2%|▏ | 17/700 [00:07<04:57, 2.30it/s, loss=0.189, lr=0.0001]\nSteps: 2%|▏ | 17/700 [00:07<04:57, 2.30it/s, loss=0.097, lr=0.0001]\nSteps: 3%|▎ | 18/700 [00:07<04:56, 2.30it/s, loss=0.097, lr=0.0001]\nSteps: 3%|▎ | 18/700 [00:08<04:56, 2.30it/s, loss=0.215, lr=0.0001]\nSteps: 3%|▎ | 19/700 [00:08<04:55, 2.30it/s, loss=0.215, lr=0.0001]\nSteps: 3%|▎ | 19/700 [00:08<04:55, 2.30it/s, loss=0.173, lr=0.0001]\nSteps: 3%|▎ | 20/700 [00:08<04:55, 2.30it/s, loss=0.173, lr=0.0001]\nSteps: 3%|▎ | 20/700 [00:08<04:55, 2.30it/s, loss=0.0768, lr=0.0001]\nSteps: 3%|▎ | 21/700 [00:09<04:54, 2.30it/s, loss=0.0768, lr=0.0001]\nSteps: 3%|▎ | 21/700 [00:09<04:54, 2.30it/s, loss=0.0714, lr=0.0001]\nSteps: 3%|▎ | 22/700 [00:09<04:54, 2.30it/s, loss=0.0714, lr=0.0001]\nSteps: 3%|▎ | 22/700 [00:09<04:54, 2.30it/s, loss=0.148, lr=0.0001] \nSteps: 3%|▎ | 23/700 [00:10<04:54, 2.30it/s, loss=0.148, lr=0.0001]\nSteps: 3%|▎ | 23/700 [00:10<04:54, 2.30it/s, loss=0.297, lr=0.0001]\nSteps: 3%|▎ | 24/700 [00:10<04:53, 2.30it/s, loss=0.297, lr=0.0001]\nSteps: 3%|▎ | 24/700 [00:10<04:53, 2.30it/s, loss=0.0754, lr=0.0001]\nSteps: 4%|▎ | 25/700 [00:11<04:53, 2.30it/s, loss=0.0754, lr=0.0001]\nSteps: 4%|▎ | 25/700 [00:11<04:53, 2.30it/s, loss=0.116, lr=0.0001] \nSteps: 4%|▎ | 26/700 [00:11<04:52, 2.30it/s, loss=0.116, lr=0.0001]\nSteps: 4%|▎ | 26/700 [00:11<04:52, 2.30it/s, loss=0.0963, lr=0.0001]\nSteps: 4%|▍ | 27/700 [00:11<04:52, 2.30it/s, loss=0.0963, lr=0.0001]\nSteps: 4%|▍ | 27/700 [00:11<04:52, 2.30it/s, loss=0.0578, lr=0.0001]\nSteps: 4%|▍ | 28/700 [00:12<04:51, 2.30it/s, loss=0.0578, lr=0.0001]\nSteps: 
4%|▍ | 28/700 [00:12<04:51, 2.30it/s, loss=0.0973, lr=0.0001]\nSteps: 4%|▍ | 29/700 [00:12<04:51, 2.30it/s, loss=0.0973, lr=0.0001]\nSteps: 4%|▍ | 29/700 [00:12<04:51, 2.30it/s, loss=0.116, lr=0.0001] \nSteps: 4%|▍ | 30/700 [00:13<04:51, 2.30it/s, loss=0.116, lr=0.0001]\nSteps: 4%|▍ | 30/700 [00:13<04:51, 2.30it/s, loss=0.191, lr=0.0001]\nSteps: 4%|▍ | 31/700 [00:13<04:50, 2.30it/s, loss=0.191, lr=0.0001]\nSteps: 4%|▍ | 31/700 [00:13<04:50, 2.30it/s, loss=0.113, lr=0.0001]\nSteps: 5%|▍ | 32/700 [00:14<04:49, 2.30it/s, loss=0.113, lr=0.0001]\nSteps: 5%|▍ | 32/700 [00:14<04:49, 2.30it/s, loss=0.187, lr=0.0001]\nSteps: 5%|▍ | 33/700 [00:14<04:50, 2.29it/s, loss=0.187, lr=0.0001]\nSteps: 5%|▍ | 33/700 [00:14<04:50, 2.29it/s, loss=0.104, lr=0.0001]\nSteps: 5%|▍ | 34/700 [00:14<04:50, 2.30it/s, loss=0.104, lr=0.0001]\nSteps: 5%|▍ | 34/700 [00:14<04:50, 2.30it/s, loss=0.176, lr=0.0001]\nSteps: 5%|▌ | 35/700 [00:15<04:49, 2.30it/s, loss=0.176, lr=0.0001]\nSteps: 5%|▌ | 35/700 [00:15<04:49, 2.30it/s, loss=0.0212, lr=0.0001]\nSteps: 5%|▌ | 36/700 [00:15<04:48, 2.30it/s, loss=0.0212, lr=0.0001]\nSteps: 5%|▌ | 36/700 [00:15<04:48, 2.30it/s, loss=0.0399, lr=0.0001]\nSteps: 5%|▌ | 37/700 [00:16<04:47, 2.30it/s, loss=0.0399, lr=0.0001]\nSteps: 5%|▌ | 37/700 [00:16<04:47, 2.30it/s, loss=0.078, lr=0.0001] \nSteps: 5%|▌ | 38/700 [00:16<04:47, 2.30it/s, loss=0.078, lr=0.0001]\nSteps: 5%|▌ | 38/700 [00:16<04:47, 2.30it/s, loss=0.208, lr=0.0001]\nSteps: 6%|▌ | 39/700 [00:17<04:46, 2.31it/s, loss=0.208, lr=0.0001]\nSteps: 6%|▌ | 39/700 [00:17<04:46, 2.31it/s, loss=0.212, lr=0.0001]\nSteps: 6%|▌ | 40/700 [00:17<04:46, 2.31it/s, loss=0.212, lr=0.0001]\nSteps: 6%|▌ | 40/700 [00:17<04:46, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 6%|▌ | 41/700 [00:17<04:45, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 6%|▌ | 41/700 [00:18<04:45, 2.31it/s, loss=0.186, lr=0.0001]\nSteps: 6%|▌ | 42/700 [00:18<04:45, 2.31it/s, loss=0.186, lr=0.0001]\nSteps: 6%|▌ | 42/700 [00:18<04:45, 2.31it/s, loss=0.0453, 
lr=0.0001]\nSteps: 6%|▌ | 43/700 [00:18<04:44, 2.31it/s, loss=0.0453, lr=0.0001]\nSteps: 6%|▌ | 43/700 [00:18<04:44, 2.31it/s, loss=0.125, lr=0.0001] \nSteps: 6%|▋ | 44/700 [00:19<04:44, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 6%|▋ | 44/700 [00:19<04:44, 2.31it/s, loss=0.299, lr=0.0001]\nSteps: 6%|▋ | 45/700 [00:19<04:43, 2.31it/s, loss=0.299, lr=0.0001]\nSteps: 6%|▋ | 45/700 [00:19<04:43, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 7%|▋ | 46/700 [00:20<04:43, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 7%|▋ | 46/700 [00:20<04:43, 2.31it/s, loss=0.178, lr=0.0001] \nSteps: 7%|▋ | 47/700 [00:20<04:43, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 7%|▋ | 47/700 [00:20<04:43, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 7%|▋ | 48/700 [00:21<04:42, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 7%|▋ | 48/700 [00:21<04:42, 2.31it/s, loss=0.0528, lr=0.0001]\nSteps: 7%|▋ | 49/700 [00:21<04:43, 2.30it/s, loss=0.0528, lr=0.0001]\nSteps: 7%|▋ | 49/700 [00:21<04:43, 2.30it/s, loss=0.159, lr=0.0001] \nSteps: 7%|▋ | 50/700 [00:21<04:42, 2.30it/s, loss=0.159, lr=0.0001]\nSteps: 7%|▋ | 50/700 [00:21<04:42, 2.30it/s, loss=0.103, lr=0.0001]\nSteps: 7%|▋ | 51/700 [00:22<04:41, 2.30it/s, loss=0.103, lr=0.0001]\nSteps: 7%|▋ | 51/700 [00:22<04:41, 2.30it/s, loss=0.034, lr=0.0001]\nSteps: 7%|▋ | 52/700 [00:22<04:41, 2.30it/s, loss=0.034, lr=0.0001]\nSteps: 7%|▋ | 52/700 [00:22<04:41, 2.30it/s, loss=0.0843, lr=0.0001]\nSteps: 8%|▊ | 53/700 [00:23<04:40, 2.31it/s, loss=0.0843, lr=0.0001]\nSteps: 8%|▊ | 53/700 [00:23<04:40, 2.31it/s, loss=0.163, lr=0.0001] \nSteps: 8%|▊ | 54/700 [00:23<04:40, 2.31it/s, loss=0.163, lr=0.0001]\nSteps: 8%|▊ | 54/700 [00:23<04:40, 2.31it/s, loss=0.202, lr=0.0001]\nSteps: 8%|▊ | 55/700 [00:24<04:40, 2.30it/s, loss=0.202, lr=0.0001]\nSteps: 8%|▊ | 55/700 [00:24<04:40, 2.30it/s, loss=0.178, lr=0.0001]\nSteps: 8%|▊ | 56/700 [00:24<04:39, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 8%|▊ | 56/700 [00:24<04:39, 2.31it/s, loss=0.215, lr=0.0001]\nSteps: 8%|▊ | 57/700 [00:24<04:38, 2.31it/s, 
loss=0.215, lr=0.0001]\nSteps: 8%|▊ | 57/700 [00:24<04:38, 2.31it/s, loss=0.0982, lr=0.0001]\nSteps: 8%|▊ | 58/700 [00:25<04:38, 2.31it/s, loss=0.0982, lr=0.0001]\nSteps: 8%|▊ | 58/700 [00:25<04:38, 2.31it/s, loss=0.143, lr=0.0001] \nSteps: 8%|▊ | 59/700 [00:25<04:37, 2.31it/s, loss=0.143, lr=0.0001]\nSteps: 8%|▊ | 59/700 [00:25<04:37, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 9%|▊ | 60/700 [00:26<04:37, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 9%|▊ | 60/700 [00:26<04:37, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 9%|▊ | 61/700 [00:26<04:36, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 9%|▊ | 61/700 [00:26<04:36, 2.31it/s, loss=0.168, lr=0.0001]\nSteps: 9%|▉ | 62/700 [00:27<04:36, 2.31it/s, loss=0.168, lr=0.0001]\nSteps: 9%|▉ | 62/700 [00:27<04:36, 2.31it/s, loss=0.098, lr=0.0001]\nSteps: 9%|▉ | 63/700 [00:27<04:36, 2.31it/s, loss=0.098, lr=0.0001]\nSteps: 9%|▉ | 63/700 [00:27<04:36, 2.31it/s, loss=0.16, lr=0.0001] \nSteps: 9%|▉ | 64/700 [00:27<04:35, 2.31it/s, loss=0.16, lr=0.0001]\nSteps: 9%|▉ | 64/700 [00:27<04:35, 2.31it/s, loss=0.0913, lr=0.0001]\nSteps: 9%|▉ | 65/700 [00:28<04:36, 2.30it/s, loss=0.0913, lr=0.0001]\nSteps: 9%|▉ | 65/700 [00:28<04:36, 2.30it/s, loss=0.232, lr=0.0001] \nSteps: 9%|▉ | 66/700 [00:28<04:36, 2.29it/s, loss=0.232, lr=0.0001]\nSteps: 9%|▉ | 66/700 [00:28<04:36, 2.29it/s, loss=0.204, lr=0.0001]\nSteps: 10%|▉ | 67/700 [00:29<04:35, 2.30it/s, loss=0.204, lr=0.0001]\nSteps: 10%|▉ | 67/700 [00:29<04:35, 2.30it/s, loss=0.0839, lr=0.0001]\nSteps: 10%|▉ | 68/700 [00:29<04:34, 2.30it/s, loss=0.0839, lr=0.0001]\nSteps: 10%|▉ | 68/700 [00:29<04:34, 2.30it/s, loss=0.163, lr=0.0001] \nSteps: 10%|▉ | 69/700 [00:30<04:33, 2.30it/s, loss=0.163, lr=0.0001]\nSteps: 10%|▉ | 69/700 [00:30<04:33, 2.30it/s, loss=0.117, lr=0.0001]\nSteps: 10%|█ | 70/700 [00:30<04:33, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 10%|█ | 70/700 [00:30<04:33, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 10%|█ | 71/700 [00:30<04:32, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 10%|█ | 71/700 
[00:31<04:32, 2.31it/s, loss=0.273, lr=0.0001]\nSteps: 10%|█ | 72/700 [00:31<04:32, 2.31it/s, loss=0.273, lr=0.0001]\nSteps: 10%|█ | 72/700 [00:31<04:32, 2.31it/s, loss=0.2, lr=0.0001] \nSteps: 10%|█ | 73/700 [00:31<04:31, 2.31it/s, loss=0.2, lr=0.0001]\nSteps: 10%|█ | 73/700 [00:31<04:31, 2.31it/s, loss=0.189, lr=0.0001]\nSteps: 11%|█ | 74/700 [00:32<04:31, 2.31it/s, loss=0.189, lr=0.0001]\nSteps: 11%|█ | 74/700 [00:32<04:31, 2.31it/s, loss=0.201, lr=0.0001]\nSteps: 11%|█ | 75/700 [00:32<04:30, 2.31it/s, loss=0.201, lr=0.0001]\nSteps: 11%|█ | 75/700 [00:32<04:30, 2.31it/s, loss=0.13, lr=0.0001] \nSteps: 11%|█ | 76/700 [00:33<04:30, 2.31it/s, loss=0.13, lr=0.0001]\nSteps: 11%|█ | 76/700 [00:33<04:30, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 11%|█ | 77/700 [00:33<04:29, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 11%|█ | 77/700 [00:33<04:29, 2.31it/s, loss=0.19, lr=0.0001] \nSteps: 11%|█ | 78/700 [00:34<04:29, 2.31it/s, loss=0.19, lr=0.0001]\nSteps: 11%|█ | 78/700 [00:34<04:29, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 11%|█▏ | 79/700 [00:34<04:28, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 11%|█▏ | 79/700 [00:34<04:28, 2.31it/s, loss=0.0576, lr=0.0001]\nSteps: 11%|█▏ | 80/700 [00:34<04:28, 2.31it/s, loss=0.0576, lr=0.0001]\nSteps: 11%|█▏ | 80/700 [00:34<04:28, 2.31it/s, loss=0.0391, lr=0.0001]\nSteps: 12%|█▏ | 81/700 [00:35<04:29, 2.30it/s, loss=0.0391, lr=0.0001]\nSteps: 12%|█▏ | 81/700 [00:35<04:29, 2.30it/s, loss=0.157, lr=0.0001] \nSteps: 12%|█▏ | 82/700 [00:35<04:28, 2.30it/s, loss=0.157, lr=0.0001]\nSteps: 12%|█▏ | 82/700 [00:35<04:28, 2.30it/s, loss=0.0326, lr=0.0001]\nSteps: 12%|█▏ | 83/700 [00:36<04:27, 2.30it/s, loss=0.0326, lr=0.0001]\nSteps: 12%|█▏ | 83/700 [00:36<04:27, 2.30it/s, loss=0.0692, lr=0.0001]\nSteps: 12%|█▏ | 84/700 [00:36<04:27, 2.30it/s, loss=0.0692, lr=0.0001]\nSteps: 12%|█▏ | 84/700 [00:36<04:27, 2.30it/s, loss=0.175, lr=0.0001] \nSteps: 12%|█▏ | 85/700 [00:37<04:26, 2.31it/s, loss=0.175, lr=0.0001]\nSteps: 12%|█▏ | 85/700 [00:37<04:26, 
2.31it/s, loss=0.134, lr=0.0001]\nSteps: 12%|█▏ | 86/700 [00:37<04:26, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 12%|█▏ | 86/700 [00:37<04:26, 2.31it/s, loss=0.137, lr=0.0001]\nSteps: 12%|█▏ | 87/700 [00:37<04:26, 2.30it/s, loss=0.137, lr=0.0001]\nSteps: 12%|█▏ | 87/700 [00:37<04:26, 2.30it/s, loss=0.0814, lr=0.0001]\nSteps: 13%|█▎ | 88/700 [00:38<04:25, 2.30it/s, loss=0.0814, lr=0.0001]\nSteps: 13%|█▎ | 88/700 [00:38<04:25, 2.30it/s, loss=0.29, lr=0.0001] \nSteps: 13%|█▎ | 89/700 [00:38<04:25, 2.31it/s, loss=0.29, lr=0.0001]\nSteps: 13%|█▎ | 89/700 [00:38<04:25, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 13%|█▎ | 90/700 [00:39<04:24, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 13%|█▎ | 90/700 [00:39<04:24, 2.31it/s, loss=0.0188, lr=0.0001]\nSteps: 13%|█▎ | 91/700 [00:39<04:24, 2.31it/s, loss=0.0188, lr=0.0001]\nSteps: 13%|█▎ | 91/700 [00:39<04:24, 2.31it/s, loss=0.146, lr=0.0001] \nSteps: 13%|█▎ | 92/700 [00:40<04:23, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 13%|█▎ | 92/700 [00:40<04:23, 2.31it/s, loss=0.0699, lr=0.0001]\nSteps: 13%|█▎ | 93/700 [00:40<04:22, 2.31it/s, loss=0.0699, lr=0.0001]\nSteps: 13%|█▎ | 93/700 [00:40<04:22, 2.31it/s, loss=0.0927, lr=0.0001]\nSteps: 13%|█▎ | 94/700 [00:40<04:22, 2.31it/s, loss=0.0927, lr=0.0001]\nSteps: 13%|█▎ | 94/700 [00:40<04:22, 2.31it/s, loss=0.147, lr=0.0001] \nSteps: 14%|█▎ | 95/700 [00:41<04:21, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 14%|█▎ | 95/700 [00:41<04:21, 2.31it/s, loss=0.0597, lr=0.0001]\nSteps: 14%|█▎ | 96/700 [00:41<04:21, 2.31it/s, loss=0.0597, lr=0.0001]\nSteps: 14%|█▎ | 96/700 [00:41<04:21, 2.31it/s, loss=0.107, lr=0.0001] \nSteps: 14%|█▍ | 97/700 [00:42<04:22, 2.30it/s, loss=0.107, lr=0.0001]\nSteps: 14%|█▍ | 97/700 [00:42<04:22, 2.30it/s, loss=0.103, lr=0.0001]\nSteps: 14%|█▍ | 98/700 [00:42<04:21, 2.30it/s, loss=0.103, lr=0.0001]\nSteps: 14%|█▍ | 98/700 [00:42<04:21, 2.30it/s, loss=0.127, lr=0.0001]\nSteps: 14%|█▍ | 99/700 [00:43<04:21, 2.30it/s, loss=0.127, lr=0.0001]\nSteps: 14%|█▍ | 99/700 [00:43<04:21, 
2.30it/s, loss=0.0597, lr=0.0001]\nSteps: 14%|█▍ | 100/700 [00:43<04:21, 2.30it/s, loss=0.0597, lr=0.0001]\nSteps: 14%|█▍ | 100/700 [00:43<04:21, 2.30it/s, loss=0.0843, lr=0.0001]\nSteps: 14%|█▍ | 101/700 [00:44<04:20, 2.30it/s, loss=0.0843, lr=0.0001]\nSteps: 14%|█▍ | 101/700 [00:44<04:20, 2.30it/s, loss=0.0791, lr=0.0001]\nSteps: 15%|█▍ | 102/700 [00:44<04:19, 2.30it/s, loss=0.0791, lr=0.0001]\nSteps: 15%|█▍ | 102/700 [00:44<04:19, 2.30it/s, loss=0.0923, lr=0.0001]\nSteps: 15%|█▍ | 103/700 [00:44<04:19, 2.30it/s, loss=0.0923, lr=0.0001]\nSteps: 15%|█▍ | 103/700 [00:44<04:19, 2.30it/s, loss=0.159, lr=0.0001] \nSteps: 15%|█▍ | 104/700 [00:45<04:18, 2.30it/s, loss=0.159, lr=0.0001]\nSteps: 15%|█▍ | 104/700 [00:45<04:18, 2.30it/s, loss=0.304, lr=0.0001]\nSteps: 15%|█▌ | 105/700 [00:45<04:18, 2.30it/s, loss=0.304, lr=0.0001]\nSteps: 15%|█▌ | 105/700 [00:45<04:18, 2.30it/s, loss=0.0677, lr=0.0001]\nSteps: 15%|█▌ | 106/700 [00:46<04:17, 2.31it/s, loss=0.0677, lr=0.0001]\nSteps: 15%|█▌ | 106/700 [00:46<04:17, 2.31it/s, loss=0.102, lr=0.0001] \nSteps: 15%|█▌ | 107/700 [00:46<04:17, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 15%|█▌ | 107/700 [00:46<04:17, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 15%|█▌ | 108/700 [00:47<04:16, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 15%|█▌ | 108/700 [00:47<04:16, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 16%|█▌ | 109/700 [00:47<04:16, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 16%|█▌ | 109/700 [00:47<04:16, 2.31it/s, loss=0.0958, lr=0.0001]\nSteps: 16%|█▌ | 110/700 [00:47<04:15, 2.31it/s, loss=0.0958, lr=0.0001]\nSteps: 16%|█▌ | 110/700 [00:47<04:15, 2.31it/s, loss=0.244, lr=0.0001] \nSteps: 16%|█▌ | 111/700 [00:48<04:15, 2.31it/s, loss=0.244, lr=0.0001]\nSteps: 16%|█▌ | 111/700 [00:48<04:15, 2.31it/s, loss=0.278, lr=0.0001]\nSteps: 16%|█▌ | 112/700 [00:48<04:14, 2.31it/s, loss=0.278, lr=0.0001]\nSteps: 16%|█▌ | 112/700 [00:48<04:14, 2.31it/s, loss=0.1, lr=0.0001] \nSteps: 16%|█▌ | 113/700 [00:49<04:15, 2.30it/s, loss=0.1, lr=0.0001]\nSteps: 
16%|█▌ | 113/700 [00:49<04:15, 2.30it/s, loss=0.133, lr=0.0001]\nSteps: 16%|█▋ | 114/700 [00:49<04:14, 2.30it/s, loss=0.133, lr=0.0001]\nSteps: 16%|█▋ | 114/700 [00:49<04:14, 2.30it/s, loss=0.253, lr=0.0001]\nSteps: 16%|█▋ | 115/700 [00:50<04:14, 2.30it/s, loss=0.253, lr=0.0001]\nSteps: 16%|█▋ | 115/700 [00:50<04:14, 2.30it/s, loss=0.114, lr=0.0001]\nSteps: 17%|█▋ | 116/700 [00:50<04:13, 2.30it/s, loss=0.114, lr=0.0001]\nSteps: 17%|█▋ | 116/700 [00:50<04:13, 2.30it/s, loss=0.154, lr=0.0001]\nSteps: 17%|█▋ | 117/700 [00:50<04:14, 2.29it/s, loss=0.154, lr=0.0001]\nSteps: 17%|█▋ | 117/700 [00:50<04:14, 2.29it/s, loss=0.202, lr=0.0001]\nSteps: 17%|█▋ | 118/700 [00:51<04:14, 2.29it/s, loss=0.202, lr=0.0001]\nSteps: 17%|█▋ | 118/700 [00:51<04:14, 2.29it/s, loss=0.0992, lr=0.0001]\nSteps: 17%|█▋ | 119/700 [00:51<04:13, 2.29it/s, loss=0.0992, lr=0.0001]\nSteps: 17%|█▋ | 119/700 [00:51<04:13, 2.29it/s, loss=0.166, lr=0.0001] \nSteps: 17%|█▋ | 120/700 [00:52<04:12, 2.30it/s, loss=0.166, lr=0.0001]\nSteps: 17%|█▋ | 120/700 [00:52<04:12, 2.30it/s, loss=0.124, lr=0.0001]\nSteps: 17%|█▋ | 121/700 [00:52<04:11, 2.30it/s, loss=0.124, lr=0.0001]\nSteps: 17%|█▋ | 121/700 [00:52<04:11, 2.30it/s, loss=0.0382, lr=0.0001]\nSteps: 17%|█▋ | 122/700 [00:53<04:11, 2.29it/s, loss=0.0382, lr=0.0001]\nSteps: 17%|█▋ | 122/700 [00:53<04:11, 2.29it/s, loss=0.0882, lr=0.0001]\nSteps: 18%|█▊ | 123/700 [00:53<04:11, 2.30it/s, loss=0.0882, lr=0.0001]\nSteps: 18%|█▊ | 123/700 [00:53<04:11, 2.30it/s, loss=0.0856, lr=0.0001]\nSteps: 18%|█▊ | 124/700 [00:54<04:10, 2.30it/s, loss=0.0856, lr=0.0001]\nSteps: 18%|█▊ | 124/700 [00:54<04:10, 2.30it/s, loss=0.145, lr=0.0001] \nSteps: 18%|█▊ | 125/700 [00:54<04:10, 2.29it/s, loss=0.145, lr=0.0001]\nSteps: 18%|█▊ | 125/700 [00:54<04:10, 2.29it/s, loss=0.14, lr=0.0001] \nSteps: 18%|█▊ | 126/700 [00:54<04:09, 2.30it/s, loss=0.14, lr=0.0001]\nSteps: 18%|█▊ | 126/700 [00:54<04:09, 2.30it/s, loss=0.194, lr=0.0001]\nSteps: 18%|█▊ | 127/700 [00:55<04:08, 2.31it/s, 
loss=0.194, lr=0.0001]\nSteps: 18%|█▊ | 127/700 [00:55<04:08, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 18%|█▊ | 128/700 [00:55<04:07, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 18%|█▊ | 128/700 [00:55<04:07, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 18%|█▊ | 129/700 [00:56<04:08, 2.30it/s, loss=0.106, lr=0.0001]\nSteps: 18%|█▊ | 129/700 [00:56<04:08, 2.30it/s, loss=0.138, lr=0.0001]\nSteps: 19%|█▊ | 130/700 [00:56<04:07, 2.30it/s, loss=0.138, lr=0.0001]\nSteps: 19%|█▊ | 130/700 [00:56<04:07, 2.30it/s, loss=0.229, lr=0.0001]\nSteps: 19%|█▊ | 131/700 [00:57<04:07, 2.30it/s, loss=0.229, lr=0.0001]\nSteps: 19%|█▊ | 131/700 [00:57<04:07, 2.30it/s, loss=0.125, lr=0.0001]\nSteps: 19%|█▉ | 132/700 [00:57<04:06, 2.30it/s, loss=0.125, lr=0.0001]\nSteps: 19%|█▉ | 132/700 [00:57<04:06, 2.30it/s, loss=0.251, lr=0.0001]\nSteps: 19%|█▉ | 133/700 [00:57<04:06, 2.30it/s, loss=0.251, lr=0.0001]\nSteps: 19%|█▉ | 133/700 [00:57<04:06, 2.30it/s, loss=0.111, lr=0.0001]\nSteps: 19%|█▉ | 134/700 [00:58<04:05, 2.30it/s, loss=0.111, lr=0.0001]\nSteps: 19%|█▉ | 134/700 [00:58<04:05, 2.30it/s, loss=0.0731, lr=0.0001]\nSteps: 19%|█▉ | 135/700 [00:58<04:05, 2.30it/s, loss=0.0731, lr=0.0001]\nSteps: 19%|█▉ | 135/700 [00:58<04:05, 2.30it/s, loss=0.146, lr=0.0001] \nSteps: 19%|█▉ | 136/700 [00:59<04:05, 2.30it/s, loss=0.146, lr=0.0001]\nSteps: 19%|█▉ | 136/700 [00:59<04:05, 2.30it/s, loss=0.0851, lr=0.0001]\nSteps: 20%|█▉ | 137/700 [00:59<04:04, 2.30it/s, loss=0.0851, lr=0.0001]\nSteps: 20%|█▉ | 137/700 [00:59<04:04, 2.30it/s, loss=0.245, lr=0.0001] \nSteps: 20%|█▉ | 138/700 [01:00<04:03, 2.31it/s, loss=0.245, lr=0.0001]\nSteps: 20%|█▉ | 138/700 [01:00<04:03, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 20%|█▉ | 139/700 [01:00<04:03, 2.30it/s, loss=0.113, lr=0.0001]\nSteps: 20%|█▉ | 139/700 [01:00<04:03, 2.30it/s, loss=0.158, lr=0.0001]\nSteps: 20%|██ | 140/700 [01:00<04:02, 2.31it/s, loss=0.158, lr=0.0001]\nSteps: 20%|██ | 140/700 [01:00<04:02, 2.31it/s, loss=0.0694, lr=0.0001]\nSteps: 20%|██ | 141/700 
[01:01<04:02, 2.31it/s, loss=0.0694, lr=0.0001]\nSteps: 20%|██ | 141/700 [01:01<04:02, 2.31it/s, loss=0.0592, lr=0.0001]\nSteps: 20%|██ | 142/700 [01:01<04:02, 2.31it/s, loss=0.0592, lr=0.0001]\nSteps: 20%|██ | 142/700 [01:01<04:02, 2.31it/s, loss=0.0842, lr=0.0001]\nSteps: 20%|██ | 143/700 [01:02<04:01, 2.31it/s, loss=0.0842, lr=0.0001]\nSteps: 20%|██ | 143/700 [01:02<04:01, 2.31it/s, loss=0.286, lr=0.0001] \nSteps: 21%|██ | 144/700 [01:02<04:00, 2.31it/s, loss=0.286, lr=0.0001]\nSteps: 21%|██ | 144/700 [01:02<04:00, 2.31it/s, loss=0.153, lr=0.0001]\nSteps: 21%|██ | 145/700 [01:03<04:01, 2.30it/s, loss=0.153, lr=0.0001]\nSteps: 21%|██ | 145/700 [01:03<04:01, 2.30it/s, loss=0.128, lr=0.0001]\nSteps: 21%|██ | 146/700 [01:03<04:00, 2.30it/s, loss=0.128, lr=0.0001]\nSteps: 21%|██ | 146/700 [01:03<04:00, 2.30it/s, loss=0.135, lr=0.0001]\nSteps: 21%|██ | 147/700 [01:03<03:59, 2.30it/s, loss=0.135, lr=0.0001]\nSteps: 21%|██ | 147/700 [01:04<03:59, 2.30it/s, loss=0.133, lr=0.0001]\nSteps: 21%|██ | 148/700 [01:04<03:59, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 21%|██ | 148/700 [01:04<03:59, 2.31it/s, loss=0.139, lr=0.0001]\nSteps: 21%|██▏ | 149/700 [01:04<03:58, 2.31it/s, loss=0.139, lr=0.0001]\nSteps: 21%|██▏ | 149/700 [01:04<03:58, 2.31it/s, loss=0.0741, lr=0.0001]\nSteps: 21%|██▏ | 150/700 [01:05<03:58, 2.31it/s, loss=0.0741, lr=0.0001]\nSteps: 21%|██▏ | 150/700 [01:05<03:58, 2.31it/s, loss=0.26, lr=0.0001] \nSteps: 22%|██▏ | 151/700 [01:05<03:57, 2.31it/s, loss=0.26, lr=0.0001]\nSteps: 22%|██▏ | 151/700 [01:05<03:57, 2.31it/s, loss=0.14, lr=0.0001]\nSteps: 22%|██▏ | 152/700 [01:06<03:57, 2.31it/s, loss=0.14, lr=0.0001]\nSteps: 22%|██▏ | 152/700 [01:06<03:57, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 22%|██▏ | 153/700 [01:06<03:56, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 22%|██▏ | 153/700 [01:06<03:56, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 22%|██▏ | 154/700 [01:07<03:56, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 22%|██▏ | 154/700 [01:07<03:56, 2.31it/s, loss=0.0301, 
lr=0.0001]\nSteps: 22%|██▏ | 155/700 [01:07<03:55, 2.31it/s, loss=0.0301, lr=0.0001]\nSteps: 22%|██▏ | 155/700 [01:07<03:55, 2.31it/s, loss=0.147, lr=0.0001] \nSteps: 22%|██▏ | 156/700 [01:07<03:55, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 22%|██▏ | 156/700 [01:07<03:55, 2.31it/s, loss=0.246, lr=0.0001]\nSteps: 22%|██▏ | 157/700 [01:08<03:55, 2.31it/s, loss=0.246, lr=0.0001]\nSteps: 22%|██▏ | 157/700 [01:08<03:55, 2.31it/s, loss=0.281, lr=0.0001]\nSteps: 23%|██▎ | 158/700 [01:08<03:54, 2.31it/s, loss=0.281, lr=0.0001]\nSteps: 23%|██▎ | 158/700 [01:08<03:54, 2.31it/s, loss=0.114, lr=0.0001]\nSteps: 23%|██▎ | 159/700 [01:09<03:54, 2.31it/s, loss=0.114, lr=0.0001]\nSteps: 23%|██▎ | 159/700 [01:09<03:54, 2.31it/s, loss=0.0437, lr=0.0001]\nSteps: 23%|██▎ | 160/700 [01:09<03:53, 2.31it/s, loss=0.0437, lr=0.0001]\nSteps: 23%|██▎ | 160/700 [01:09<03:53, 2.31it/s, loss=0.0781, lr=0.0001]\nSteps: 23%|██▎ | 161/700 [01:10<03:54, 2.30it/s, loss=0.0781, lr=0.0001]\nSteps: 23%|██▎ | 161/700 [01:10<03:54, 2.30it/s, loss=0.0544, lr=0.0001]\nSteps: 23%|██▎ | 162/700 [01:10<03:53, 2.30it/s, loss=0.0544, lr=0.0001]\nSteps: 23%|██▎ | 162/700 [01:10<03:53, 2.30it/s, loss=0.199, lr=0.0001] \nSteps: 23%|██▎ | 163/700 [01:10<03:53, 2.30it/s, loss=0.199, lr=0.0001]\nSteps: 23%|██▎ | 163/700 [01:10<03:53, 2.30it/s, loss=0.164, lr=0.0001]\nSteps: 23%|██▎ | 164/700 [01:11<03:52, 2.31it/s, loss=0.164, lr=0.0001]\nSteps: 23%|██▎ | 164/700 [01:11<03:52, 2.31it/s, loss=0.0932, lr=0.0001]\nSteps: 24%|██▎ | 165/700 [01:11<03:51, 2.31it/s, loss=0.0932, lr=0.0001]\nSteps: 24%|██▎ | 165/700 [01:11<03:51, 2.31it/s, loss=0.116, lr=0.0001] \nSteps: 24%|██▎ | 166/700 [01:12<03:51, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 24%|██▎ | 166/700 [01:12<03:51, 2.31it/s, loss=0.0942, lr=0.0001]\nSteps: 24%|██▍ | 167/700 [01:12<03:50, 2.31it/s, loss=0.0942, lr=0.0001]\nSteps: 24%|██▍ | 167/700 [01:12<03:50, 2.31it/s, loss=0.105, lr=0.0001] \nSteps: 24%|██▍ | 168/700 [01:13<03:50, 2.31it/s, loss=0.105, 
lr=0.0001]\nSteps: 24%|██▍ | 168/700 [01:13<03:50, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 24%|██▍ | 169/700 [01:13<03:50, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 24%|██▍ | 169/700 [01:13<03:50, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 24%|██▍ | 170/700 [01:13<03:49, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 24%|██▍ | 170/700 [01:13<03:49, 2.31it/s, loss=0.0638, lr=0.0001]\nSteps: 24%|██▍ | 171/700 [01:14<03:49, 2.31it/s, loss=0.0638, lr=0.0001]\nSteps: 24%|██▍ | 171/700 [01:14<03:49, 2.31it/s, loss=0.16, lr=0.0001] \nSteps: 25%|██▍ | 172/700 [01:14<03:48, 2.31it/s, loss=0.16, lr=0.0001]\nSteps: 25%|██▍ | 172/700 [01:14<03:48, 2.31it/s, loss=0.215, lr=0.0001]\nSteps: 25%|██▍ | 173/700 [01:15<03:48, 2.31it/s, loss=0.215, lr=0.0001]\nSteps: 25%|██▍ | 173/700 [01:15<03:48, 2.31it/s, loss=0.21, lr=0.0001] \nSteps: 25%|██▍ | 174/700 [01:15<03:47, 2.31it/s, loss=0.21, lr=0.0001]\nSteps: 25%|██▍ | 174/700 [01:15<03:47, 2.31it/s, loss=0.174, lr=0.0001]\nSteps: 25%|██▌ | 175/700 [01:16<03:47, 2.31it/s, loss=0.174, lr=0.0001]\nSteps: 25%|██▌ | 175/700 [01:16<03:47, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 25%|██▌ | 176/700 [01:16<03:46, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 25%|██▌ | 176/700 [01:16<03:46, 2.31it/s, loss=0.169, lr=0.0001]\nSteps: 25%|██▌ | 177/700 [01:16<03:47, 2.30it/s, loss=0.169, lr=0.0001]\nSteps: 25%|██▌ | 177/700 [01:17<03:47, 2.30it/s, loss=0.0948, lr=0.0001]\nSteps: 25%|██▌ | 178/700 [01:17<03:46, 2.30it/s, loss=0.0948, lr=0.0001]\nSteps: 25%|██▌ | 178/700 [01:17<03:46, 2.30it/s, loss=0.275, lr=0.0001] \nSteps: 26%|██▌ | 179/700 [01:17<03:46, 2.30it/s, loss=0.275, lr=0.0001]\nSteps: 26%|██▌ | 179/700 [01:17<03:46, 2.30it/s, loss=0.109, lr=0.0001]\nSteps: 26%|██▌ | 180/700 [01:18<03:45, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 26%|██▌ | 180/700 [01:18<03:45, 2.31it/s, loss=0.0641, lr=0.0001]\nSteps: 26%|██▌ | 181/700 [01:18<03:45, 2.30it/s, loss=0.0641, lr=0.0001]\nSteps: 26%|██▌ | 181/700 [01:18<03:45, 2.30it/s, loss=0.245, lr=0.0001] \nSteps: 
26%|██▌ | 182/700 [01:19<03:44, 2.31it/s, loss=0.245, lr=0.0001]\nSteps: 26%|██▌ | 182/700 [01:19<03:44, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 26%|██▌ | 183/700 [01:19<03:44, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 26%|██▌ | 183/700 [01:19<03:44, 2.31it/s, loss=0.0986, lr=0.0001]\nSteps: 26%|██▋ | 184/700 [01:20<03:43, 2.30it/s, loss=0.0986, lr=0.0001]\nSteps: 26%|██▋ | 184/700 [01:20<03:43, 2.30it/s, loss=0.152, lr=0.0001] \nSteps: 26%|██▋ | 185/700 [01:20<03:43, 2.31it/s, loss=0.152, lr=0.0001]\nSteps: 26%|██▋ | 185/700 [01:20<03:43, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 27%|██▋ | 186/700 [01:20<03:42, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 27%|██▋ | 186/700 [01:20<03:42, 2.31it/s, loss=0.172, lr=0.0001]\nSteps: 27%|██▋ | 187/700 [01:21<03:42, 2.31it/s, loss=0.172, lr=0.0001]\nSteps: 27%|██▋ | 187/700 [01:21<03:42, 2.31it/s, loss=0.31, lr=0.0001] \nSteps: 27%|██▋ | 188/700 [01:21<03:42, 2.30it/s, loss=0.31, lr=0.0001]\nSteps: 27%|██▋ | 188/700 [01:21<03:42, 2.30it/s, loss=0.124, lr=0.0001]\nSteps: 27%|██▋ | 189/700 [01:22<03:41, 2.30it/s, loss=0.124, lr=0.0001]\nSteps: 27%|██▋ | 189/700 [01:22<03:41, 2.30it/s, loss=0.049, lr=0.0001]\nSteps: 27%|██▋ | 190/700 [01:22<03:41, 2.30it/s, loss=0.049, lr=0.0001]\nSteps: 27%|██▋ | 190/700 [01:22<03:41, 2.30it/s, loss=0.0852, lr=0.0001]\nSteps: 27%|██▋ | 191/700 [01:23<03:41, 2.30it/s, loss=0.0852, lr=0.0001]\nSteps: 27%|██▋ | 191/700 [01:23<03:41, 2.30it/s, loss=0.0649, lr=0.0001]\nSteps: 27%|██▋ | 192/700 [01:23<03:40, 2.31it/s, loss=0.0649, lr=0.0001]\nSteps: 27%|██▋ | 192/700 [01:23<03:40, 2.31it/s, loss=0.0476, lr=0.0001]\nSteps: 28%|██▊ | 193/700 [01:23<03:41, 2.29it/s, loss=0.0476, lr=0.0001]\nSteps: 28%|██▊ | 193/700 [01:23<03:41, 2.29it/s, loss=0.0807, lr=0.0001]\nSteps: 28%|██▊ | 194/700 [01:24<03:40, 2.29it/s, loss=0.0807, lr=0.0001]\nSteps: 28%|██▊ | 194/700 [01:24<03:40, 2.29it/s, loss=0.207, lr=0.0001] \nSteps: 28%|██▊ | 195/700 [01:24<03:39, 2.30it/s, loss=0.207, lr=0.0001]\nSteps: 28%|██▊ | 195/700 
[01:24<03:39, 2.30it/s, loss=0.153, lr=0.0001]\nSteps: 28%|██▊ | 196/700 [01:25<03:38, 2.30it/s, loss=0.153, lr=0.0001]\nSteps: 28%|██▊ | 196/700 [01:25<03:38, 2.30it/s, loss=0.0468, lr=0.0001]\nSteps: 28%|██▊ | 197/700 [01:25<03:38, 2.31it/s, loss=0.0468, lr=0.0001]\nSteps: 28%|██▊ | 197/700 [01:25<03:38, 2.31it/s, loss=0.194, lr=0.0001] \nSteps: 28%|██▊ | 198/700 [01:26<03:37, 2.31it/s, loss=0.194, lr=0.0001]\nSteps: 28%|██▊ | 198/700 [01:26<03:37, 2.31it/s, loss=0.341, lr=0.0001]\nSteps: 28%|██▊ | 199/700 [01:26<03:37, 2.31it/s, loss=0.341, lr=0.0001]\nSteps: 28%|██▊ | 199/700 [01:26<03:37, 2.31it/s, loss=0.0981, lr=0.0001]\nSteps: 29%|██▊ | 200/700 [01:26<03:36, 2.31it/s, loss=0.0981, lr=0.0001]\nSteps: 29%|██▊ | 200/700 [01:27<03:36, 2.31it/s, loss=0.193, lr=0.0001] \nSteps: 29%|██▊ | 201/700 [01:27<03:36, 2.30it/s, loss=0.193, lr=0.0001]\nSteps: 29%|██▊ | 201/700 [01:27<03:36, 2.30it/s, loss=0.0917, lr=0.0001]\nSteps: 29%|██▉ | 202/700 [01:27<03:35, 2.31it/s, loss=0.0917, lr=0.0001]\nSteps: 29%|██▉ | 202/700 [01:27<03:35, 2.31it/s, loss=0.149, lr=0.0001] \nSteps: 29%|██▉ | 203/700 [01:28<03:35, 2.31it/s, loss=0.149, lr=0.0001]\nSteps: 29%|██▉ | 203/700 [01:28<03:35, 2.31it/s, loss=0.0842, lr=0.0001]\nSteps: 29%|██▉ | 204/700 [01:28<03:34, 2.31it/s, loss=0.0842, lr=0.0001]\nSteps: 29%|██▉ | 204/700 [01:28<03:34, 2.31it/s, loss=0.27, lr=0.0001] \nSteps: 29%|██▉ | 205/700 [01:29<03:34, 2.31it/s, loss=0.27, lr=0.0001]\nSteps: 29%|██▉ | 205/700 [01:29<03:34, 2.31it/s, loss=0.234, lr=0.0001]\nSteps: 29%|██▉ | 206/700 [01:29<03:34, 2.31it/s, loss=0.234, lr=0.0001]\nSteps: 29%|██▉ | 206/700 [01:29<03:34, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 30%|██▉ | 207/700 [01:30<03:33, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 30%|██▉ | 207/700 [01:30<03:33, 2.31it/s, loss=0.0958, lr=0.0001]\nSteps: 30%|██▉ | 208/700 [01:30<03:33, 2.31it/s, loss=0.0958, lr=0.0001]\nSteps: 30%|██▉ | 208/700 [01:30<03:33, 2.31it/s, loss=0.0906, lr=0.0001]\nSteps: 30%|██▉ | 209/700 [01:30<03:33, 
2.30it/s, loss=0.0906, lr=0.0001]\nSteps: 30%|██▉ | 209/700 [01:30<03:33, 2.30it/s, loss=0.0941, lr=0.0001]\nSteps: 30%|███ | 210/700 [01:31<03:32, 2.30it/s, loss=0.0941, lr=0.0001]\nSteps: 30%|███ | 210/700 [01:31<03:32, 2.30it/s, loss=0.0909, lr=0.0001]\nSteps: 30%|███ | 211/700 [01:31<03:32, 2.30it/s, loss=0.0909, lr=0.0001]\nSteps: 30%|███ | 211/700 [01:31<03:32, 2.30it/s, loss=0.126, lr=0.0001] \nSteps: 30%|███ | 212/700 [01:32<03:31, 2.30it/s, loss=0.126, lr=0.0001]\nSteps: 30%|███ | 212/700 [01:32<03:31, 2.30it/s, loss=0.148, lr=0.0001]\nSteps: 30%|███ | 213/700 [01:32<03:31, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 30%|███ | 213/700 [01:32<03:31, 2.31it/s, loss=0.259, lr=0.0001]\nSteps: 31%|███ | 214/700 [01:33<03:30, 2.31it/s, loss=0.259, lr=0.0001]\nSteps: 31%|███ | 214/700 [01:33<03:30, 2.31it/s, loss=0.233, lr=0.0001]\nSteps: 31%|███ | 215/700 [01:33<03:30, 2.31it/s, loss=0.233, lr=0.0001]\nSteps: 31%|███ | 215/700 [01:33<03:30, 2.31it/s, loss=0.0979, lr=0.0001]\nSteps: 31%|███ | 216/700 [01:33<03:29, 2.31it/s, loss=0.0979, lr=0.0001]\nSteps: 31%|███ | 216/700 [01:33<03:29, 2.31it/s, loss=0.167, lr=0.0001] \nSteps: 31%|███ | 217/700 [01:34<03:29, 2.31it/s, loss=0.167, lr=0.0001]\nSteps: 31%|███ | 217/700 [01:34<03:29, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 31%|███ | 218/700 [01:34<03:28, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 31%|███ | 218/700 [01:34<03:28, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 31%|███▏ | 219/700 [01:35<03:28, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 31%|███▏ | 219/700 [01:35<03:28, 2.31it/s, loss=0.0973, lr=0.0001]\nSteps: 31%|███▏ | 220/700 [01:35<03:27, 2.31it/s, loss=0.0973, lr=0.0001]\nSteps: 31%|███▏ | 220/700 [01:35<03:27, 2.31it/s, loss=0.113, lr=0.0001] \nSteps: 32%|███▏ | 221/700 [01:36<03:27, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 32%|███▏ | 221/700 [01:36<03:27, 2.31it/s, loss=0.094, lr=0.0001]\nSteps: 32%|███▏ | 222/700 [01:36<03:26, 2.31it/s, loss=0.094, lr=0.0001]\nSteps: 32%|███▏ | 222/700 [01:36<03:26, 
2.31it/s, loss=0.141, lr=0.0001]\nSteps: 32%|███▏ | 223/700 [01:36<03:26, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 32%|███▏ | 223/700 [01:36<03:26, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 32%|███▏ | 224/700 [01:37<03:25, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 32%|███▏ | 224/700 [01:37<03:25, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 32%|███▏ | 225/700 [01:37<03:26, 2.30it/s, loss=0.105, lr=0.0001]\nSteps: 32%|███▏ | 225/700 [01:37<03:26, 2.30it/s, loss=0.255, lr=0.0001]\nSteps: 32%|███▏ | 226/700 [01:38<03:25, 2.30it/s, loss=0.255, lr=0.0001]\nSteps: 32%|███▏ | 226/700 [01:38<03:25, 2.30it/s, loss=0.189, lr=0.0001]\nSteps: 32%|███▏ | 227/700 [01:38<03:25, 2.30it/s, loss=0.189, lr=0.0001]\nSteps: 32%|███▏ | 227/700 [01:38<03:25, 2.30it/s, loss=0.117, lr=0.0001]\nSteps: 33%|███▎ | 228/700 [01:39<03:24, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 33%|███▎ | 228/700 [01:39<03:24, 2.31it/s, loss=0.0894, lr=0.0001]\nSteps: 33%|███▎ | 229/700 [01:39<03:24, 2.31it/s, loss=0.0894, lr=0.0001]\nSteps: 33%|███▎ | 229/700 [01:39<03:24, 2.31it/s, loss=0.107, lr=0.0001] \nSteps: 33%|███▎ | 230/700 [01:39<03:23, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 33%|███▎ | 230/700 [01:40<03:23, 2.31it/s, loss=0.0873, lr=0.0001]\nSteps: 33%|███▎ | 231/700 [01:40<03:23, 2.31it/s, loss=0.0873, lr=0.0001]\nSteps: 33%|███▎ | 231/700 [01:40<03:23, 2.31it/s, loss=0.0671, lr=0.0001]\nSteps: 33%|███▎ | 232/700 [01:40<03:22, 2.31it/s, loss=0.0671, lr=0.0001]\nSteps: 33%|███▎ | 232/700 [01:40<03:22, 2.31it/s, loss=0.094, lr=0.0001] \nSteps: 33%|███▎ | 233/700 [01:41<03:22, 2.31it/s, loss=0.094, lr=0.0001]\nSteps: 33%|███▎ | 233/700 [01:41<03:22, 2.31it/s, loss=0.124, lr=0.0001]\nSteps: 33%|███▎ | 234/700 [01:41<03:21, 2.31it/s, loss=0.124, lr=0.0001]\nSteps: 33%|███▎ | 234/700 [01:41<03:21, 2.31it/s, loss=0.0847, lr=0.0001]\nSteps: 34%|███▎ | 235/700 [01:42<03:21, 2.31it/s, loss=0.0847, lr=0.0001]\nSteps: 34%|███▎ | 235/700 [01:42<03:21, 2.31it/s, loss=0.236, lr=0.0001] \nSteps: 34%|███▎ | 236/700 
[01:42<03:20, 2.31it/s, loss=0.236, lr=0.0001]\nSteps: 34%|███▎ | 236/700 [01:42<03:20, 2.31it/s, loss=0.0215, lr=0.0001]\nSteps: 34%|███▍ | 237/700 [01:43<03:20, 2.31it/s, loss=0.0215, lr=0.0001]\nSteps: 34%|███▍ | 237/700 [01:43<03:20, 2.31it/s, loss=0.0918, lr=0.0001]\nSteps: 34%|███▍ | 238/700 [01:43<03:19, 2.31it/s, loss=0.0918, lr=0.0001]\nSteps: 34%|███▍ | 238/700 [01:43<03:19, 2.31it/s, loss=0.152, lr=0.0001] \nSteps: 34%|███▍ | 239/700 [01:43<03:19, 2.31it/s, loss=0.152, lr=0.0001]\nSteps: 34%|███▍ | 239/700 [01:43<03:19, 2.31it/s, loss=0.0908, lr=0.0001]\nSteps: 34%|███▍ | 240/700 [01:44<03:18, 2.31it/s, loss=0.0908, lr=0.0001]\nSteps: 34%|███▍ | 240/700 [01:44<03:18, 2.31it/s, loss=0.0664, lr=0.0001]\nSteps: 34%|███▍ | 241/700 [01:44<03:19, 2.30it/s, loss=0.0664, lr=0.0001]\nSteps: 34%|███▍ | 241/700 [01:44<03:19, 2.30it/s, loss=0.0761, lr=0.0001]\nSteps: 35%|███▍ | 242/700 [01:45<03:18, 2.30it/s, loss=0.0761, lr=0.0001]\nSteps: 35%|███▍ | 242/700 [01:45<03:18, 2.30it/s, loss=0.0773, lr=0.0001]\nSteps: 35%|███▍ | 243/700 [01:45<03:18, 2.31it/s, loss=0.0773, lr=0.0001]\nSteps: 35%|███▍ | 243/700 [01:45<03:18, 2.31it/s, loss=0.127, lr=0.0001] \nSteps: 35%|███▍ | 244/700 [01:46<03:17, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 35%|███▍ | 244/700 [01:46<03:17, 2.31it/s, loss=0.16, lr=0.0001] \nSteps: 35%|███▌ | 245/700 [01:46<03:17, 2.31it/s, loss=0.16, lr=0.0001]\nSteps: 35%|███▌ | 245/700 [01:46<03:17, 2.31it/s, loss=0.0749, lr=0.0001]\nSteps: 35%|███▌ | 246/700 [01:46<03:16, 2.31it/s, loss=0.0749, lr=0.0001]\nSteps: 35%|███▌ | 246/700 [01:46<03:16, 2.31it/s, loss=0.143, lr=0.0001] \nSteps: 35%|███▌ | 247/700 [01:47<03:16, 2.31it/s, loss=0.143, lr=0.0001]\nSteps: 35%|███▌ | 247/700 [01:47<03:16, 2.31it/s, loss=0.221, lr=0.0001]\nSteps: 35%|███▌ | 248/700 [01:47<03:15, 2.31it/s, loss=0.221, lr=0.0001]\nSteps: 35%|███▌ | 248/700 [01:47<03:15, 2.31it/s, loss=0.0879, lr=0.0001]\nSteps: 36%|███▌ | 249/700 [01:48<03:15, 2.31it/s, loss=0.0879, lr=0.0001]\nSteps: 
36%|███▌ | 249/700 [01:48<03:15, 2.31it/s, loss=0.0838, lr=0.0001]\nSteps: 36%|███▌ | 250/700 [01:48<03:14, 2.31it/s, loss=0.0838, lr=0.0001]\nSteps: 36%|███▌ | 250/700 [01:48<03:14, 2.31it/s, loss=0.166, lr=0.0001] \nSteps: 36%|███▌ | 251/700 [01:49<03:14, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 36%|███▌ | 251/700 [01:49<03:14, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 36%|███▌ | 252/700 [01:49<03:13, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 36%|███▌ | 252/700 [01:49<03:13, 2.31it/s, loss=0.256, lr=0.0001]\nSteps: 36%|███▌ | 253/700 [01:49<03:13, 2.31it/s, loss=0.256, lr=0.0001]\nSteps: 36%|███▌ | 253/700 [01:49<03:13, 2.31it/s, loss=0.044, lr=0.0001]\nSteps: 36%|███▋ | 254/700 [01:50<03:12, 2.31it/s, loss=0.044, lr=0.0001]\nSteps: 36%|███▋ | 254/700 [01:50<03:12, 2.31it/s, loss=0.182, lr=0.0001]\nSteps: 36%|███▋ | 255/700 [01:50<03:12, 2.31it/s, loss=0.182, lr=0.0001]\nSteps: 36%|███▋ | 255/700 [01:50<03:12, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 37%|███▋ | 256/700 [01:51<03:12, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 37%|███▋ | 256/700 [01:51<03:12, 2.31it/s, loss=0.151, lr=0.0001]\nSteps: 37%|███▋ | 257/700 [01:51<03:12, 2.30it/s, loss=0.151, lr=0.0001]\nSteps: 37%|███▋ | 257/700 [01:51<03:12, 2.30it/s, loss=0.0976, lr=0.0001]\nSteps: 37%|███▋ | 258/700 [01:52<03:11, 2.30it/s, loss=0.0976, lr=0.0001]\nSteps: 37%|███▋ | 258/700 [01:52<03:11, 2.30it/s, loss=0.193, lr=0.0001] \nSteps: 37%|███▋ | 259/700 [01:52<03:11, 2.31it/s, loss=0.193, lr=0.0001]\nSteps: 37%|███▋ | 259/700 [01:52<03:11, 2.31it/s, loss=0.0853, lr=0.0001]\nSteps: 37%|███▋ | 260/700 [01:52<03:10, 2.31it/s, loss=0.0853, lr=0.0001]\nSteps: 37%|███▋ | 260/700 [01:53<03:10, 2.31it/s, loss=0.201, lr=0.0001] \nSteps: 37%|███▋ | 261/700 [01:53<03:10, 2.31it/s, loss=0.201, lr=0.0001]\nSteps: 37%|███▋ | 261/700 [01:53<03:10, 2.31it/s, loss=0.191, lr=0.0001]\nSteps: 37%|███▋ | 262/700 [01:53<03:09, 2.31it/s, loss=0.191, lr=0.0001]\nSteps: 37%|███▋ | 262/700 [01:53<03:09, 2.31it/s, loss=0.0494, 
lr=0.0001]\nSteps: 38%|███▊ | 263/700 [01:54<03:09, 2.31it/s, loss=0.0494, lr=0.0001]\nSteps: 38%|███▊ | 263/700 [01:54<03:09, 2.31it/s, loss=0.0995, lr=0.0001]\nSteps: 38%|███▊ | 264/700 [01:54<03:08, 2.31it/s, loss=0.0995, lr=0.0001]\nSteps: 38%|███▊ | 264/700 [01:54<03:08, 2.31it/s, loss=0.204, lr=0.0001] \nSteps: 38%|███▊ | 265/700 [01:55<03:08, 2.31it/s, loss=0.204, lr=0.0001]\nSteps: 38%|███▊ | 265/700 [01:55<03:08, 2.31it/s, loss=0.18, lr=0.0001] \nSteps: 38%|███▊ | 266/700 [01:55<03:07, 2.31it/s, loss=0.18, lr=0.0001]\nSteps: 38%|███▊ | 266/700 [01:55<03:07, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 38%|███▊ | 267/700 [01:56<03:07, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 38%|███▊ | 267/700 [01:56<03:07, 2.31it/s, loss=0.243, lr=0.0001]\nSteps: 38%|███▊ | 268/700 [01:56<03:06, 2.31it/s, loss=0.243, lr=0.0001]\nSteps: 38%|███▊ | 268/700 [01:56<03:06, 2.31it/s, loss=0.0764, lr=0.0001]\nSteps: 38%|███▊ | 269/700 [01:56<03:06, 2.31it/s, loss=0.0764, lr=0.0001]\nSteps: 38%|███▊ | 269/700 [01:56<03:06, 2.31it/s, loss=0.103, lr=0.0001] \nSteps: 39%|███▊ | 270/700 [01:57<03:06, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 39%|███▊ | 270/700 [01:57<03:06, 2.31it/s, loss=0.114, lr=0.0001]\nSteps: 39%|███▊ | 271/700 [01:57<03:05, 2.31it/s, loss=0.114, lr=0.0001]\nSteps: 39%|███▊ | 271/700 [01:57<03:05, 2.31it/s, loss=0.206, lr=0.0001]\nSteps: 39%|███▉ | 272/700 [01:58<03:05, 2.31it/s, loss=0.206, lr=0.0001]\nSteps: 39%|███▉ | 272/700 [01:58<03:05, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 39%|███▉ | 273/700 [01:58<03:05, 2.30it/s, loss=0.108, lr=0.0001]\nSteps: 39%|███▉ | 273/700 [01:58<03:05, 2.30it/s, loss=0.14, lr=0.0001] \nSteps: 39%|███▉ | 274/700 [01:59<03:04, 2.30it/s, loss=0.14, lr=0.0001]\nSteps: 39%|███▉ | 274/700 [01:59<03:04, 2.30it/s, loss=0.0251, lr=0.0001]\nSteps: 39%|███▉ | 275/700 [01:59<03:04, 2.31it/s, loss=0.0251, lr=0.0001]\nSteps: 39%|███▉ | 275/700 [01:59<03:04, 2.31it/s, loss=0.151, lr=0.0001] \nSteps: 39%|███▉ | 276/700 [01:59<03:03, 2.31it/s, 
loss=0.151, lr=0.0001]\nSteps: 39%|███▉ | 276/700 [01:59<03:03, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 40%|███▉ | 277/700 [02:00<03:03, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 40%|███▉ | 277/700 [02:00<03:03, 2.31it/s, loss=0.097, lr=0.0001]\nSteps: 40%|███▉ | 278/700 [02:00<03:02, 2.31it/s, loss=0.097, lr=0.0001]\nSteps: 40%|███▉ | 278/700 [02:00<03:02, 2.31it/s, loss=0.293, lr=0.0001]\nSteps: 40%|███▉ | 279/700 [02:01<03:02, 2.31it/s, loss=0.293, lr=0.0001]\nSteps: 40%|███▉ | 279/700 [02:01<03:02, 2.31it/s, loss=0.286, lr=0.0001]\nSteps: 40%|████ | 280/700 [02:01<03:01, 2.31it/s, loss=0.286, lr=0.0001]\nSteps: 40%|████ | 280/700 [02:01<03:01, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 40%|████ | 281/700 [02:02<03:01, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 40%|████ | 281/700 [02:02<03:01, 2.31it/s, loss=0.2, lr=0.0001] \nSteps: 40%|████ | 282/700 [02:02<03:00, 2.31it/s, loss=0.2, lr=0.0001]\nSteps: 40%|████ | 282/700 [02:02<03:00, 2.31it/s, loss=0.153, lr=0.0001]\nSteps: 40%|████ | 283/700 [02:02<03:00, 2.31it/s, loss=0.153, lr=0.0001]\nSteps: 40%|████ | 283/700 [02:02<03:00, 2.31it/s, loss=0.132, lr=0.0001]\nSteps: 41%|████ | 284/700 [02:03<02:59, 2.31it/s, loss=0.132, lr=0.0001]\nSteps: 41%|████ | 284/700 [02:03<02:59, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 41%|████ | 285/700 [02:03<02:59, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 41%|████ | 285/700 [02:03<02:59, 2.31it/s, loss=0.159, lr=0.0001]\nSteps: 41%|████ | 286/700 [02:04<02:59, 2.31it/s, loss=0.159, lr=0.0001]\nSteps: 41%|████ | 286/700 [02:04<02:59, 2.31it/s, loss=0.0701, lr=0.0001]\nSteps: 41%|████ | 287/700 [02:04<02:58, 2.31it/s, loss=0.0701, lr=0.0001]\nSteps: 41%|████ | 287/700 [02:04<02:58, 2.31it/s, loss=0.134, lr=0.0001] \nSteps: 41%|████ | 288/700 [02:05<02:58, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 41%|████ | 288/700 [02:05<02:58, 2.31it/s, loss=0.188, lr=0.0001]\nSteps: 41%|████▏ | 289/700 [02:05<02:58, 2.30it/s, loss=0.188, lr=0.0001]\nSteps: 41%|████▏ | 289/700 [02:05<02:58, 2.30it/s, 
loss=0.0311, lr=0.0001]\nSteps: 41%|████▏ | 290/700 [02:05<02:58, 2.30it/s, loss=0.0311, lr=0.0001]\nSteps: 41%|████▏ | 290/700 [02:05<02:58, 2.30it/s, loss=0.13, lr=0.0001] \nSteps: 42%|████▏ | 291/700 [02:06<02:57, 2.30it/s, loss=0.13, lr=0.0001]\nSteps: 42%|████▏ | 291/700 [02:06<02:57, 2.30it/s, loss=0.286, lr=0.0001]\nSteps: 42%|████▏ | 292/700 [02:06<02:57, 2.30it/s, loss=0.286, lr=0.0001]\nSteps: 42%|████▏ | 292/700 [02:06<02:57, 2.30it/s, loss=0.136, lr=0.0001]\nSteps: 42%|████▏ | 293/700 [02:07<02:56, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 42%|████▏ | 293/700 [02:07<02:56, 2.31it/s, loss=0.0702, lr=0.0001]\nSteps: 42%|████▏ | 294/700 [02:07<02:55, 2.31it/s, loss=0.0702, lr=0.0001]\nSteps: 42%|████▏ | 294/700 [02:07<02:55, 2.31it/s, loss=0.161, lr=0.0001] \nSteps: 42%|████▏ | 295/700 [02:08<02:55, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 42%|████▏ | 295/700 [02:08<02:55, 2.31it/s, loss=0.0911, lr=0.0001]\nSteps: 42%|████▏ | 296/700 [02:08<02:54, 2.31it/s, loss=0.0911, lr=0.0001]\nSteps: 42%|████▏ | 296/700 [02:08<02:54, 2.31it/s, loss=0.074, lr=0.0001] \nSteps: 42%|████▏ | 297/700 [02:08<02:54, 2.31it/s, loss=0.074, lr=0.0001]\nSteps: 42%|████▏ | 297/700 [02:09<02:54, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 43%|████▎ | 298/700 [02:09<02:54, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 43%|████▎ | 298/700 [02:09<02:54, 2.31it/s, loss=0.0824, lr=0.0001]\nSteps: 43%|████▎ | 299/700 [02:09<02:53, 2.31it/s, loss=0.0824, lr=0.0001]\nSteps: 43%|████▎ | 299/700 [02:09<02:53, 2.31it/s, loss=0.124, lr=0.0001] \nSteps: 43%|████▎ | 300/700 [02:10<02:53, 2.31it/s, loss=0.124, lr=0.0001]\nSteps: 43%|████▎ | 300/700 [02:10<02:53, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 43%|████▎ | 301/700 [02:10<02:53, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 43%|████▎ | 301/700 [02:10<02:53, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 43%|████▎ | 302/700 [02:11<02:52, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 43%|████▎ | 302/700 [02:11<02:52, 2.31it/s, loss=0.0999, lr=0.0001]\nSteps: 
43%|████▎ | 303/700 [02:11<02:51, 2.31it/s, loss=0.0999, lr=0.0001]\nSteps: 43%|████▎ | 303/700 [02:11<02:51, 2.31it/s, loss=0.0991, lr=0.0001]\nSteps: 43%|████▎ | 304/700 [02:12<02:51, 2.31it/s, loss=0.0991, lr=0.0001]\nSteps: 43%|████▎ | 304/700 [02:12<02:51, 2.31it/s, loss=0.206, lr=0.0001] \nSteps: 44%|████▎ | 305/700 [02:12<02:51, 2.30it/s, loss=0.206, lr=0.0001]\nSteps: 44%|████▎ | 305/700 [02:12<02:51, 2.30it/s, loss=0.0953, lr=0.0001]\nSteps: 44%|████▎ | 306/700 [02:12<02:51, 2.30it/s, loss=0.0953, lr=0.0001]\nSteps: 44%|████▎ | 306/700 [02:12<02:51, 2.30it/s, loss=0.132, lr=0.0001] \nSteps: 44%|████▍ | 307/700 [02:13<02:50, 2.31it/s, loss=0.132, lr=0.0001]\nSteps: 44%|████▍ | 307/700 [02:13<02:50, 2.31it/s, loss=0.0862, lr=0.0001]\nSteps: 44%|████▍ | 308/700 [02:13<02:49, 2.31it/s, loss=0.0862, lr=0.0001]\nSteps: 44%|████▍ | 308/700 [02:13<02:49, 2.31it/s, loss=0.0361, lr=0.0001]\nSteps: 44%|████▍ | 309/700 [02:14<02:49, 2.31it/s, loss=0.0361, lr=0.0001]\nSteps: 44%|████▍ | 309/700 [02:14<02:49, 2.31it/s, loss=0.229, lr=0.0001] \nSteps: 44%|████▍ | 310/700 [02:14<02:49, 2.31it/s, loss=0.229, lr=0.0001]\nSteps: 44%|████▍ | 310/700 [02:14<02:49, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 44%|████▍ | 311/700 [02:15<02:48, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 44%|████▍ | 311/700 [02:15<02:48, 2.31it/s, loss=0.163, lr=0.0001]\nSteps: 45%|████▍ | 312/700 [02:15<02:47, 2.31it/s, loss=0.163, lr=0.0001]\nSteps: 45%|████▍ | 312/700 [02:15<02:47, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 45%|████▍ | 313/700 [02:15<02:47, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 45%|████▍ | 313/700 [02:15<02:47, 2.31it/s, loss=0.309, lr=0.0001]\nSteps: 45%|████▍ | 314/700 [02:16<02:47, 2.31it/s, loss=0.309, lr=0.0001]\nSteps: 45%|████▍ | 314/700 [02:16<02:47, 2.31it/s, loss=0.0657, lr=0.0001]\nSteps: 45%|████▌ | 315/700 [02:16<02:46, 2.31it/s, loss=0.0657, lr=0.0001]\nSteps: 45%|████▌ | 315/700 [02:16<02:46, 2.31it/s, loss=0.0988, lr=0.0001]\nSteps: 45%|████▌ | 316/700 [02:17<02:46, 
2.31it/s, loss=0.0988, lr=0.0001]\nSteps: 45%|████▌ | 316/700 [02:17<02:46, 2.31it/s, loss=0.103, lr=0.0001] \nSteps: 45%|████▌ | 317/700 [02:17<02:45, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 45%|████▌ | 317/700 [02:17<02:45, 2.31it/s, loss=0.282, lr=0.0001]\nSteps: 45%|████▌ | 318/700 [02:18<02:45, 2.31it/s, loss=0.282, lr=0.0001]\nSteps: 45%|████▌ | 318/700 [02:18<02:45, 2.31it/s, loss=0.162, lr=0.0001]\nSteps: 46%|████▌ | 319/700 [02:18<02:45, 2.31it/s, loss=0.162, lr=0.0001]\nSteps: 46%|████▌ | 319/700 [02:18<02:45, 2.31it/s, loss=0.11, lr=0.0001] \nSteps: 46%|████▌ | 320/700 [02:18<02:44, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 46%|████▌ | 320/700 [02:18<02:44, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 46%|████▌ | 321/700 [02:19<02:44, 2.30it/s, loss=0.165, lr=0.0001]\nSteps: 46%|████▌ | 321/700 [02:19<02:44, 2.30it/s, loss=0.105, lr=0.0001]\nSteps: 46%|████▌ | 322/700 [02:19<02:44, 2.30it/s, loss=0.105, lr=0.0001]\nSteps: 46%|████▌ | 322/700 [02:19<02:44, 2.30it/s, loss=0.246, lr=0.0001]\nSteps: 46%|████▌ | 323/700 [02:20<02:43, 2.30it/s, loss=0.246, lr=0.0001]\nSteps: 46%|████▌ | 323/700 [02:20<02:43, 2.30it/s, loss=0.0769, lr=0.0001]\nSteps: 46%|████▋ | 324/700 [02:20<02:43, 2.31it/s, loss=0.0769, lr=0.0001]\nSteps: 46%|████▋ | 324/700 [02:20<02:43, 2.31it/s, loss=0.101, lr=0.0001] \nSteps: 46%|████▋ | 325/700 [02:21<02:42, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 46%|████▋ | 325/700 [02:21<02:42, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 47%|████▋ | 326/700 [02:21<02:42, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 47%|████▋ | 326/700 [02:21<02:42, 2.31it/s, loss=0.175, lr=0.0001]\nSteps: 47%|████▋ | 327/700 [02:22<02:41, 2.31it/s, loss=0.175, lr=0.0001]\nSteps: 47%|████▋ | 327/700 [02:22<02:41, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 47%|████▋ | 328/700 [02:22<02:40, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 47%|████▋ | 328/700 [02:22<02:40, 2.31it/s, loss=0.258, lr=0.0001]\nSteps: 47%|████▋ | 329/700 [02:22<02:40, 2.31it/s, loss=0.258, lr=0.0001]\nSteps: 
47%|████▋ | 329/700 [02:22<02:40, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 47%|████▋ | 330/700 [02:23<02:40, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 47%|████▋ | 330/700 [02:23<02:40, 2.31it/s, loss=0.0967, lr=0.0001]\nSteps: 47%|████▋ | 331/700 [02:23<02:39, 2.31it/s, loss=0.0967, lr=0.0001]\nSteps: 47%|████▋ | 331/700 [02:23<02:39, 2.31it/s, loss=0.0688, lr=0.0001]\nSteps: 47%|████▋ | 332/700 [02:24<02:39, 2.31it/s, loss=0.0688, lr=0.0001]\nSteps: 47%|████▋ | 332/700 [02:24<02:39, 2.31it/s, loss=0.102, lr=0.0001] \nSteps: 48%|████▊ | 333/700 [02:24<02:38, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 48%|████▊ | 333/700 [02:24<02:38, 2.31it/s, loss=0.0854, lr=0.0001]\nSteps: 48%|████▊ | 334/700 [02:25<02:38, 2.31it/s, loss=0.0854, lr=0.0001]\nSteps: 48%|████▊ | 334/700 [02:25<02:38, 2.31it/s, loss=0.0907, lr=0.0001]\nSteps: 48%|████▊ | 335/700 [02:25<02:37, 2.31it/s, loss=0.0907, lr=0.0001]\nSteps: 48%|████▊ | 335/700 [02:25<02:37, 2.31it/s, loss=0.243, lr=0.0001] \nSteps: 48%|████▊ | 336/700 [02:25<02:37, 2.31it/s, loss=0.243, lr=0.0001]\nSteps: 48%|████▊ | 336/700 [02:25<02:37, 2.31it/s, loss=0.182, lr=0.0001]\nSteps: 48%|████▊ | 337/700 [02:26<02:37, 2.30it/s, loss=0.182, lr=0.0001]\nSteps: 48%|████▊ | 337/700 [02:26<02:37, 2.30it/s, loss=0.165, lr=0.0001]\nSteps: 48%|████▊ | 338/700 [02:26<02:37, 2.30it/s, loss=0.165, lr=0.0001]\nSteps: 48%|████▊ | 338/700 [02:26<02:37, 2.30it/s, loss=0.116, lr=0.0001]\nSteps: 48%|████▊ | 339/700 [02:27<02:36, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 48%|████▊ | 339/700 [02:27<02:36, 2.31it/s, loss=0.0656, lr=0.0001]\nSteps: 49%|████▊ | 340/700 [02:27<02:36, 2.31it/s, loss=0.0656, lr=0.0001]\nSteps: 49%|████▊ | 340/700 [02:27<02:36, 2.31it/s, loss=0.0485, lr=0.0001]\nSteps: 49%|████▊ | 341/700 [02:28<02:35, 2.31it/s, loss=0.0485, lr=0.0001]\nSteps: 49%|████▊ | 341/700 [02:28<02:35, 2.31it/s, loss=0.0723, lr=0.0001]\nSteps: 49%|████▉ | 342/700 [02:28<02:34, 2.31it/s, loss=0.0723, lr=0.0001]\nSteps: 49%|████▉ | 342/700 [02:28<02:34, 
2.31it/s, loss=0.057, lr=0.0001] \nSteps: 49%|████▉ | 343/700 [02:28<02:34, 2.31it/s, loss=0.057, lr=0.0001]\nSteps: 49%|████▉ | 343/700 [02:28<02:34, 2.31it/s, loss=0.159, lr=0.0001]\nSteps: 49%|████▉ | 344/700 [02:29<02:34, 2.31it/s, loss=0.159, lr=0.0001]\nSteps: 49%|████▉ | 344/700 [02:29<02:34, 2.31it/s, loss=0.193, lr=0.0001]\nSteps: 49%|████▉ | 345/700 [02:29<02:33, 2.31it/s, loss=0.193, lr=0.0001]\nSteps: 49%|████▉ | 345/700 [02:29<02:33, 2.31it/s, loss=0.236, lr=0.0001]\nSteps: 49%|████▉ | 346/700 [02:30<02:33, 2.31it/s, loss=0.236, lr=0.0001]\nSteps: 49%|████▉ | 346/700 [02:30<02:33, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 50%|████▉ | 347/700 [02:30<02:33, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 50%|████▉ | 347/700 [02:30<02:33, 2.31it/s, loss=0.0848, lr=0.0001]\nSteps: 50%|████▉ | 348/700 [02:31<02:32, 2.31it/s, loss=0.0848, lr=0.0001]\nSteps: 50%|████▉ | 348/700 [02:31<02:32, 2.31it/s, loss=0.135, lr=0.0001] \nSteps: 50%|████▉ | 349/700 [02:31<02:32, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 50%|████▉ | 349/700 [02:31<02:32, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 50%|█████ | 350/700 [02:31<02:31, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 50%|█████ | 350/700 [02:31<02:31, 2.31it/s, loss=0.0529, lr=0.0001]\nSteps: 50%|█████ | 351/700 [02:32<02:31, 2.31it/s, loss=0.0529, lr=0.0001]\nSteps: 50%|█████ | 351/700 [02:32<02:31, 2.31it/s, loss=0.0894, lr=0.0001]\nSteps: 50%|█████ | 352/700 [02:32<02:30, 2.31it/s, loss=0.0894, lr=0.0001]\nSteps: 50%|█████ | 352/700 [02:32<02:30, 2.31it/s, loss=0.343, lr=0.0001] \nSteps: 50%|█████ | 353/700 [02:33<02:30, 2.30it/s, loss=0.343, lr=0.0001]\nSteps: 50%|█████ | 353/700 [02:33<02:30, 2.30it/s, loss=0.195, lr=0.0001]\nSteps: 51%|█████ | 354/700 [02:33<02:30, 2.30it/s, loss=0.195, lr=0.0001]\nSteps: 51%|█████ | 354/700 [02:33<02:30, 2.30it/s, loss=0.107, lr=0.0001]\nSteps: 51%|█████ | 355/700 [02:34<02:29, 2.30it/s, loss=0.107, lr=0.0001]\nSteps: 51%|█████ | 355/700 [02:34<02:29, 2.30it/s, loss=0.0284, 
lr=0.0001]\nSteps: 51%|█████ | 356/700 [02:34<02:29, 2.31it/s, loss=0.0284, lr=0.0001]\nSteps: 51%|█████ | 356/700 [02:34<02:29, 2.31it/s, loss=0.167, lr=0.0001] \nSteps: 51%|█████ | 357/700 [02:35<02:28, 2.31it/s, loss=0.167, lr=0.0001]\nSteps: 51%|█████ | 357/700 [02:35<02:28, 2.31it/s, loss=0.14, lr=0.0001] \nSteps: 51%|█████ | 358/700 [02:35<02:28, 2.31it/s, loss=0.14, lr=0.0001]\nSteps: 51%|█████ | 358/700 [02:35<02:28, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 51%|█████▏ | 359/700 [02:35<02:27, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 51%|█████▏ | 359/700 [02:35<02:27, 2.31it/s, loss=0.199, lr=0.0001]\nSteps: 51%|█████▏ | 360/700 [02:36<02:27, 2.31it/s, loss=0.199, lr=0.0001]\nSteps: 51%|█████▏ | 360/700 [02:36<02:27, 2.31it/s, loss=0.2, lr=0.0001] \nSteps: 52%|█████▏ | 361/700 [02:36<02:26, 2.31it/s, loss=0.2, lr=0.0001]\nSteps: 52%|█████▏ | 361/700 [02:36<02:26, 2.31it/s, loss=0.0617, lr=0.0001]\nSteps: 52%|█████▏ | 362/700 [02:37<02:26, 2.31it/s, loss=0.0617, lr=0.0001]\nSteps: 52%|█████▏ | 362/700 [02:37<02:26, 2.31it/s, loss=0.202, lr=0.0001] \nSteps: 52%|█████▏ | 363/700 [02:37<02:25, 2.31it/s, loss=0.202, lr=0.0001]\nSteps: 52%|█████▏ | 363/700 [02:37<02:25, 2.31it/s, loss=0.081, lr=0.0001]\nSteps: 52%|█████▏ | 364/700 [02:38<02:25, 2.31it/s, loss=0.081, lr=0.0001]\nSteps: 52%|█████▏ | 364/700 [02:38<02:25, 2.31it/s, loss=0.158, lr=0.0001]\nSteps: 52%|█████▏ | 365/700 [02:38<02:25, 2.31it/s, loss=0.158, lr=0.0001]\nSteps: 52%|█████▏ | 365/700 [02:38<02:25, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 52%|█████▏ | 366/700 [02:38<02:24, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 52%|█████▏ | 366/700 [02:38<02:24, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 52%|█████▏ | 367/700 [02:39<02:24, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 52%|█████▏ | 367/700 [02:39<02:24, 2.31it/s, loss=0.261, lr=0.0001]\nSteps: 53%|█████▎ | 368/700 [02:39<02:23, 2.31it/s, loss=0.261, lr=0.0001]\nSteps: 53%|█████▎ | 368/700 [02:39<02:23, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 
53%|█████▎ | 369/700 [02:40<02:24, 2.30it/s, loss=0.119, lr=0.0001]\nSteps: 53%|█████▎ | 369/700 [02:40<02:24, 2.30it/s, loss=0.0896, lr=0.0001]\nSteps: 53%|█████▎ | 370/700 [02:40<02:23, 2.30it/s, loss=0.0896, lr=0.0001]\nSteps: 53%|█████▎ | 370/700 [02:40<02:23, 2.30it/s, loss=0.101, lr=0.0001] \nSteps: 53%|█████▎ | 371/700 [02:41<02:23, 2.30it/s, loss=0.101, lr=0.0001]\nSteps: 53%|█████▎ | 371/700 [02:41<02:23, 2.30it/s, loss=0.112, lr=0.0001]\nSteps: 53%|█████▎ | 372/700 [02:41<02:22, 2.30it/s, loss=0.112, lr=0.0001]\nSteps: 53%|█████▎ | 372/700 [02:41<02:22, 2.30it/s, loss=0.132, lr=0.0001]\nSteps: 53%|█████▎ | 373/700 [02:41<02:21, 2.30it/s, loss=0.132, lr=0.0001]\nSteps: 53%|█████▎ | 373/700 [02:41<02:21, 2.30it/s, loss=0.15, lr=0.0001] \nSteps: 53%|█████▎ | 374/700 [02:42<02:21, 2.31it/s, loss=0.15, lr=0.0001]\nSteps: 53%|█████▎ | 374/700 [02:42<02:21, 2.31it/s, loss=0.326, lr=0.0001]\nSteps: 54%|█████▎ | 375/700 [02:42<02:20, 2.31it/s, loss=0.326, lr=0.0001]\nSteps: 54%|█████▎ | 375/700 [02:42<02:20, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 54%|█████▎ | 376/700 [02:43<02:20, 2.30it/s, loss=0.117, lr=0.0001]\nSteps: 54%|█████▎ | 376/700 [02:43<02:20, 2.30it/s, loss=0.128, lr=0.0001]\nSteps: 54%|█████▍ | 377/700 [02:43<02:20, 2.30it/s, loss=0.128, lr=0.0001]\nSteps: 54%|█████▍ | 377/700 [02:43<02:20, 2.30it/s, loss=0.146, lr=0.0001]\nSteps: 54%|█████▍ | 378/700 [02:44<02:19, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 54%|█████▍ | 378/700 [02:44<02:19, 2.31it/s, loss=0.219, lr=0.0001]\nSteps: 54%|█████▍ | 379/700 [02:44<02:19, 2.31it/s, loss=0.219, lr=0.0001]\nSteps: 54%|█████▍ | 379/700 [02:44<02:19, 2.31it/s, loss=0.0741, lr=0.0001]\nSteps: 54%|█████▍ | 380/700 [02:44<02:18, 2.31it/s, loss=0.0741, lr=0.0001]\nSteps: 54%|█████▍ | 380/700 [02:45<02:18, 2.31it/s, loss=0.104, lr=0.0001] \nSteps: 54%|█████▍ | 381/700 [02:45<02:18, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 54%|█████▍ | 381/700 [02:45<02:18, 2.31it/s, loss=0.0772, lr=0.0001]\nSteps: 55%|█████▍ | 
382/700 [02:45<02:18, 2.30it/s, loss=0.0772, lr=0.0001]\nSteps: 55%|█████▍ | 382/700 [02:45<02:18, 2.30it/s, loss=0.213, lr=0.0001] \nSteps: 55%|█████▍ | 383/700 [02:46<02:29, 2.11it/s, loss=0.213, lr=0.0001]\nSteps: 55%|█████▍ | 383/700 [02:46<02:29, 2.11it/s, loss=0.197, lr=0.0001]\nSteps: 55%|█████▍ | 384/700 [02:46<02:25, 2.16it/s, loss=0.197, lr=0.0001]\nSteps: 55%|█████▍ | 384/700 [02:46<02:25, 2.16it/s, loss=0.172, lr=0.0001]\nSteps: 55%|█████▌ | 385/700 [02:47<02:23, 2.20it/s, loss=0.172, lr=0.0001]\nSteps: 55%|█████▌ | 385/700 [02:47<02:23, 2.20it/s, loss=0.108, lr=0.0001]\nSteps: 55%|█████▌ | 386/700 [02:47<02:20, 2.23it/s, loss=0.108, lr=0.0001]\nSteps: 55%|█████▌ | 386/700 [02:47<02:20, 2.23it/s, loss=0.0851, lr=0.0001]\nSteps: 55%|█████▌ | 387/700 [02:48<02:18, 2.25it/s, loss=0.0851, lr=0.0001]\nSteps: 55%|█████▌ | 387/700 [02:48<02:18, 2.25it/s, loss=0.037, lr=0.0001] \nSteps: 55%|█████▌ | 388/700 [02:48<02:17, 2.27it/s, loss=0.037, lr=0.0001]\nSteps: 55%|█████▌ | 388/700 [02:48<02:17, 2.27it/s, loss=0.278, lr=0.0001]\nSteps: 56%|█████▌ | 389/700 [02:49<02:16, 2.28it/s, loss=0.278, lr=0.0001]\nSteps: 56%|█████▌ | 389/700 [02:49<02:16, 2.28it/s, loss=0.0438, lr=0.0001]\nSteps: 56%|█████▌ | 390/700 [02:49<02:15, 2.29it/s, loss=0.0438, lr=0.0001]\nSteps: 56%|█████▌ | 390/700 [02:49<02:15, 2.29it/s, loss=0.171, lr=0.0001] \nSteps: 56%|█████▌ | 391/700 [02:49<02:14, 2.29it/s, loss=0.171, lr=0.0001]\nSteps: 56%|█████▌ | 391/700 [02:49<02:14, 2.29it/s, loss=0.0965, lr=0.0001]\nSteps: 56%|█████▌ | 392/700 [02:50<02:14, 2.30it/s, loss=0.0965, lr=0.0001]\nSteps: 56%|█████▌ | 392/700 [02:50<02:14, 2.30it/s, loss=0.061, lr=0.0001] \nSteps: 56%|█████▌ | 393/700 [02:50<02:13, 2.30it/s, loss=0.061, lr=0.0001]\nSteps: 56%|█████▌ | 393/700 [02:50<02:13, 2.30it/s, loss=0.0909, lr=0.0001]\nSteps: 56%|█████▋ | 394/700 [02:51<02:12, 2.30it/s, loss=0.0909, lr=0.0001]\nSteps: 56%|█████▋ | 394/700 [02:51<02:12, 2.30it/s, loss=0.0822, lr=0.0001]\nSteps: 56%|█████▋ | 395/700 
[02:51<02:12, 2.31it/s, loss=0.0822, lr=0.0001]\nSteps: 56%|█████▋ | 395/700 [02:51<02:12, 2.31it/s, loss=0.0202, lr=0.0001]\nSteps: 57%|█████▋ | 396/700 [02:52<02:11, 2.31it/s, loss=0.0202, lr=0.0001]\nSteps: 57%|█████▋ | 396/700 [02:52<02:11, 2.31it/s, loss=0.084, lr=0.0001] \nSteps: 57%|█████▋ | 397/700 [02:52<02:11, 2.31it/s, loss=0.084, lr=0.0001]\nSteps: 57%|█████▋ | 397/700 [02:52<02:11, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 57%|█████▋ | 398/700 [02:52<02:10, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 57%|█████▋ | 398/700 [02:52<02:10, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 57%|█████▋ | 399/700 [02:53<02:10, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 57%|█████▋ | 399/700 [02:53<02:10, 2.31it/s, loss=0.17, lr=0.0001] \nSteps: 57%|█████▋ | 400/700 [02:53<02:09, 2.31it/s, loss=0.17, lr=0.0001]\nSteps: 57%|█████▋ | 400/700 [02:53<02:09, 2.31it/s, loss=0.176, lr=0.0001]\nSteps: 57%|█████▋ | 401/700 [02:54<02:10, 2.30it/s, loss=0.176, lr=0.0001]\nSteps: 57%|█████▋ | 401/700 [02:54<02:10, 2.30it/s, loss=0.165, lr=0.0001]\nSteps: 57%|█████▋ | 402/700 [02:54<02:09, 2.30it/s, loss=0.165, lr=0.0001]\nSteps: 57%|█████▋ | 402/700 [02:54<02:09, 2.30it/s, loss=0.0535, lr=0.0001]\nSteps: 58%|█████▊ | 403/700 [02:55<02:08, 2.31it/s, loss=0.0535, lr=0.0001]\nSteps: 58%|█████▊ | 403/700 [02:55<02:08, 2.31it/s, loss=0.15, lr=0.0001] \nSteps: 58%|█████▊ | 404/700 [02:55<02:08, 2.31it/s, loss=0.15, lr=0.0001]\nSteps: 58%|█████▊ | 404/700 [02:55<02:08, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 58%|█████▊ | 405/700 [02:55<02:07, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 58%|█████▊ | 405/700 [02:55<02:07, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 58%|█████▊ | 406/700 [02:56<02:07, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 58%|█████▊ | 406/700 [02:56<02:07, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 58%|█████▊ | 407/700 [02:56<02:06, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 58%|█████▊ | 407/700 [02:56<02:06, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 58%|█████▊ | 408/700 [02:57<02:06, 
2.31it/s, loss=0.135, lr=0.0001]\nSteps: 58%|█████▊ | 408/700 [02:57<02:06, 2.31it/s, loss=0.0779, lr=0.0001]\nSteps: 58%|█████▊ | 409/700 [02:57<02:05, 2.31it/s, loss=0.0779, lr=0.0001]\nSteps: 58%|█████▊ | 409/700 [02:57<02:05, 2.31it/s, loss=0.125, lr=0.0001] \nSteps: 59%|█████▊ | 410/700 [02:58<02:05, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 59%|█████▊ | 410/700 [02:58<02:05, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 59%|█████▊ | 411/700 [02:58<02:05, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 59%|█████▊ | 411/700 [02:58<02:05, 2.31it/s, loss=0.187, lr=0.0001]\nSteps: 59%|█████▉ | 412/700 [02:58<02:04, 2.31it/s, loss=0.187, lr=0.0001]\nSteps: 59%|█████▉ | 412/700 [02:59<02:04, 2.31it/s, loss=0.0657, lr=0.0001]\nSteps: 59%|█████▉ | 413/700 [02:59<02:04, 2.31it/s, loss=0.0657, lr=0.0001]\nSteps: 59%|█████▉ | 413/700 [02:59<02:04, 2.31it/s, loss=0.0886, lr=0.0001]\nSteps: 59%|█████▉ | 414/700 [02:59<02:03, 2.31it/s, loss=0.0886, lr=0.0001]\nSteps: 59%|█████▉ | 414/700 [02:59<02:03, 2.31it/s, loss=0.127, lr=0.0001] \nSteps: 59%|█████▉ | 415/700 [03:00<02:03, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 59%|█████▉ | 415/700 [03:00<02:03, 2.31it/s, loss=0.0474, lr=0.0001]\nSteps: 59%|█████▉ | 416/700 [03:00<02:02, 2.31it/s, loss=0.0474, lr=0.0001]\nSteps: 59%|█████▉ | 416/700 [03:00<02:02, 2.31it/s, loss=0.135, lr=0.0001] \nSteps: 60%|█████▉ | 417/700 [03:01<02:03, 2.30it/s, loss=0.135, lr=0.0001]\nSteps: 60%|█████▉ | 417/700 [03:01<02:03, 2.30it/s, loss=0.127, lr=0.0001]\nSteps: 60%|█████▉ | 418/700 [03:01<02:02, 2.30it/s, loss=0.127, lr=0.0001]\nSteps: 60%|█████▉ | 418/700 [03:01<02:02, 2.30it/s, loss=0.136, lr=0.0001]\nSteps: 60%|█████▉ | 419/700 [03:02<02:01, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 60%|█████▉ | 419/700 [03:02<02:01, 2.31it/s, loss=0.197, lr=0.0001]\nSteps: 60%|██████ | 420/700 [03:02<02:01, 2.31it/s, loss=0.197, lr=0.0001]\nSteps: 60%|██████ | 420/700 [03:02<02:01, 2.31it/s, loss=0.0675, lr=0.0001]\nSteps: 60%|██████ | 421/700 [03:02<02:00, 2.31it/s, 
loss=0.0675, lr=0.0001]\nSteps: 60%|██████ | 421/700 [03:02<02:00, 2.31it/s, loss=0.0898, lr=0.0001]\nSteps: 60%|██████ | 422/700 [03:03<02:00, 2.31it/s, loss=0.0898, lr=0.0001]\nSteps: 60%|██████ | 422/700 [03:03<02:00, 2.31it/s, loss=0.118, lr=0.0001] \nSteps: 60%|██████ | 423/700 [03:03<01:59, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 60%|██████ | 423/700 [03:03<01:59, 2.31it/s, loss=0.14, lr=0.0001] \nSteps: 61%|██████ | 424/700 [03:04<01:59, 2.31it/s, loss=0.14, lr=0.0001]\nSteps: 61%|██████ | 424/700 [03:04<01:59, 2.31it/s, loss=0.0937, lr=0.0001]\nSteps: 61%|██████ | 425/700 [03:04<01:59, 2.31it/s, loss=0.0937, lr=0.0001]\nSteps: 61%|██████ | 425/700 [03:04<01:59, 2.31it/s, loss=0.138, lr=0.0001] \nSteps: 61%|██████ | 426/700 [03:05<01:58, 2.31it/s, loss=0.138, lr=0.0001]\nSteps: 61%|██████ | 426/700 [03:05<01:58, 2.31it/s, loss=0.158, lr=0.0001]\nSteps: 61%|██████ | 427/700 [03:05<01:58, 2.31it/s, loss=0.158, lr=0.0001]\nSteps: 61%|██████ | 427/700 [03:05<01:58, 2.31it/s, loss=0.0508, lr=0.0001]\nSteps: 61%|██████ | 428/700 [03:05<01:57, 2.31it/s, loss=0.0508, lr=0.0001]\nSteps: 61%|██████ | 428/700 [03:05<01:57, 2.31it/s, loss=0.0954, lr=0.0001]\nSteps: 61%|██████▏ | 429/700 [03:06<01:57, 2.31it/s, loss=0.0954, lr=0.0001]\nSteps: 61%|██████▏ | 429/700 [03:06<01:57, 2.31it/s, loss=0.315, lr=0.0001] \nSteps: 61%|██████▏ | 430/700 [03:06<01:56, 2.31it/s, loss=0.315, lr=0.0001]\nSteps: 61%|██████▏ | 430/700 [03:06<01:56, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 62%|██████▏ | 431/700 [03:07<01:56, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 62%|██████▏ | 431/700 [03:07<01:56, 2.31it/s, loss=0.09, lr=0.0001] \nSteps: 62%|██████▏ | 432/700 [03:07<01:55, 2.31it/s, loss=0.09, lr=0.0001]\nSteps: 62%|██████▏ | 432/700 [03:07<01:55, 2.31it/s, loss=0.0611, lr=0.0001]\nSteps: 62%|██████▏ | 433/700 [03:08<01:56, 2.30it/s, loss=0.0611, lr=0.0001]\nSteps: 62%|██████▏ | 433/700 [03:08<01:56, 2.30it/s, loss=0.23, lr=0.0001] \nSteps: 62%|██████▏ | 434/700 [03:08<01:55, 2.30it/s, 
loss=0.23, lr=0.0001]\nSteps: 62%|██████▏ | 434/700 [03:08<01:55, 2.30it/s, loss=0.221, lr=0.0001]\nSteps: 62%|██████▏ | 435/700 [03:08<01:55, 2.30it/s, loss=0.221, lr=0.0001]\nSteps: 62%|██████▏ | 435/700 [03:08<01:55, 2.30it/s, loss=0.0432, lr=0.0001]\nSteps: 62%|██████▏ | 436/700 [03:09<01:54, 2.31it/s, loss=0.0432, lr=0.0001]\nSteps: 62%|██████▏ | 436/700 [03:09<01:54, 2.31it/s, loss=0.127, lr=0.0001] \nSteps: 62%|██████▏ | 437/700 [03:09<01:53, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 62%|██████▏ | 437/700 [03:09<01:53, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 63%|██████▎ | 438/700 [03:10<01:53, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 63%|██████▎ | 438/700 [03:10<01:53, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 63%|██████▎ | 439/700 [03:10<01:53, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 63%|██████▎ | 439/700 [03:10<01:53, 2.31it/s, loss=0.0318, lr=0.0001]\nSteps: 63%|██████▎ | 440/700 [03:11<01:52, 2.31it/s, loss=0.0318, lr=0.0001]\nSteps: 63%|██████▎ | 440/700 [03:11<01:52, 2.31it/s, loss=0.109, lr=0.0001] \nSteps: 63%|██████▎ | 441/700 [03:11<01:52, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 63%|██████▎ | 441/700 [03:11<01:52, 2.31it/s, loss=0.0869, lr=0.0001]\nSteps: 63%|██████▎ | 442/700 [03:11<01:51, 2.31it/s, loss=0.0869, lr=0.0001]\nSteps: 63%|██████▎ | 442/700 [03:12<01:51, 2.31it/s, loss=0.0479, lr=0.0001]\nSteps: 63%|██████▎ | 443/700 [03:12<01:51, 2.31it/s, loss=0.0479, lr=0.0001]\nSteps: 63%|██████▎ | 443/700 [03:12<01:51, 2.31it/s, loss=0.0615, lr=0.0001]\nSteps: 63%|██████▎ | 444/700 [03:12<01:50, 2.31it/s, loss=0.0615, lr=0.0001]\nSteps: 63%|██████▎ | 444/700 [03:12<01:50, 2.31it/s, loss=0.0695, lr=0.0001]\nSteps: 64%|██████▎ | 445/700 [03:13<01:50, 2.31it/s, loss=0.0695, lr=0.0001]\nSteps: 64%|██████▎ | 445/700 [03:13<01:50, 2.31it/s, loss=0.109, lr=0.0001] \nSteps: 64%|██████▎ | 446/700 [03:13<01:49, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 64%|██████▎ | 446/700 [03:13<01:49, 2.31it/s, loss=0.155, lr=0.0001]\nSteps: 64%|██████▍ | 447/700 
[03:14<01:49, 2.31it/s, loss=0.155, lr=0.0001]\nSteps: 64%|██████▍ | 447/700 [03:14<01:49, 2.31it/s, loss=0.0106, lr=0.0001]\nSteps: 64%|██████▍ | 448/700 [03:14<01:49, 2.31it/s, loss=0.0106, lr=0.0001]\nSteps: 64%|██████▍ | 448/700 [03:14<01:49, 2.31it/s, loss=0.176, lr=0.0001] \nSteps: 64%|██████▍ | 449/700 [03:15<01:49, 2.30it/s, loss=0.176, lr=0.0001]\nSteps: 64%|██████▍ | 449/700 [03:15<01:49, 2.30it/s, loss=0.193, lr=0.0001]\nSteps: 64%|██████▍ | 450/700 [03:15<01:48, 2.30it/s, loss=0.193, lr=0.0001]\nSteps: 64%|██████▍ | 450/700 [03:15<01:48, 2.30it/s, loss=0.104, lr=0.0001]\nSteps: 64%|██████▍ | 451/700 [03:15<01:47, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 64%|██████▍ | 451/700 [03:15<01:47, 2.31it/s, loss=0.0734, lr=0.0001]\nSteps: 65%|██████▍ | 452/700 [03:16<01:47, 2.31it/s, loss=0.0734, lr=0.0001]\nSteps: 65%|██████▍ | 452/700 [03:16<01:47, 2.31it/s, loss=0.272, lr=0.0001] \nSteps: 65%|██████▍ | 453/700 [03:16<01:47, 2.31it/s, loss=0.272, lr=0.0001]\nSteps: 65%|██████▍ | 453/700 [03:16<01:47, 2.31it/s, loss=0.0395, lr=0.0001]\nSteps: 65%|██████▍ | 454/700 [03:17<01:46, 2.31it/s, loss=0.0395, lr=0.0001]\nSteps: 65%|██████▍ | 454/700 [03:17<01:46, 2.31it/s, loss=0.118, lr=0.0001] \nSteps: 65%|██████▌ | 455/700 [03:17<01:46, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 65%|██████▌ | 455/700 [03:17<01:46, 2.31it/s, loss=0.0978, lr=0.0001]\nSteps: 65%|██████▌ | 456/700 [03:18<01:45, 2.31it/s, loss=0.0978, lr=0.0001]\nSteps: 65%|██████▌ | 456/700 [03:18<01:45, 2.31it/s, loss=0.152, lr=0.0001] \nSteps: 65%|██████▌ | 457/700 [03:18<01:45, 2.31it/s, loss=0.152, lr=0.0001]\nSteps: 65%|██████▌ | 457/700 [03:18<01:45, 2.31it/s, loss=0.095, lr=0.0001]\nSteps: 65%|██████▌ | 458/700 [03:18<01:44, 2.31it/s, loss=0.095, lr=0.0001]\nSteps: 65%|██████▌ | 458/700 [03:18<01:44, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 66%|██████▌ | 459/700 [03:19<01:44, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 66%|██████▌ | 459/700 [03:19<01:44, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 
66%|██████▌ | 460/700 [03:19<01:43, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 66%|██████▌ | 460/700 [03:19<01:43, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 66%|██████▌ | 461/700 [03:20<01:43, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 66%|██████▌ | 461/700 [03:20<01:43, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 66%|██████▌ | 462/700 [03:20<01:43, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 66%|██████▌ | 462/700 [03:20<01:43, 2.31it/s, loss=0.162, lr=0.0001]\nSteps: 66%|██████▌ | 463/700 [03:21<01:42, 2.31it/s, loss=0.162, lr=0.0001]\nSteps: 66%|██████▌ | 463/700 [03:21<01:42, 2.31it/s, loss=0.177, lr=0.0001]\nSteps: 66%|██████▋ | 464/700 [03:21<01:42, 2.31it/s, loss=0.177, lr=0.0001]\nSteps: 66%|██████▋ | 464/700 [03:21<01:42, 2.31it/s, loss=0.158, lr=0.0001]\nSteps: 66%|██████▋ | 465/700 [03:21<01:42, 2.30it/s, loss=0.158, lr=0.0001]\nSteps: 66%|██████▋ | 465/700 [03:21<01:42, 2.30it/s, loss=0.203, lr=0.0001]\nSteps: 67%|██████▋ | 466/700 [03:22<01:41, 2.30it/s, loss=0.203, lr=0.0001]\nSteps: 67%|██████▋ | 466/700 [03:22<01:41, 2.30it/s, loss=0.0449, lr=0.0001]\nSteps: 67%|██████▋ | 467/700 [03:22<01:41, 2.31it/s, loss=0.0449, lr=0.0001]\nSteps: 67%|██████▋ | 467/700 [03:22<01:41, 2.31it/s, loss=0.259, lr=0.0001] \nSteps: 67%|██████▋ | 468/700 [03:23<01:40, 2.31it/s, loss=0.259, lr=0.0001]\nSteps: 67%|██████▋ | 468/700 [03:23<01:40, 2.31it/s, loss=0.177, lr=0.0001]\nSteps: 67%|██████▋ | 469/700 [03:23<01:40, 2.31it/s, loss=0.177, lr=0.0001]\nSteps: 67%|██████▋ | 469/700 [03:23<01:40, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 67%|██████▋ | 470/700 [03:24<01:39, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 67%|██████▋ | 470/700 [03:24<01:39, 2.31it/s, loss=0.164, lr=0.0001]\nSteps: 67%|██████▋ | 471/700 [03:24<01:39, 2.31it/s, loss=0.164, lr=0.0001]\nSteps: 67%|██████▋ | 471/700 [03:24<01:39, 2.31it/s, loss=0.0637, lr=0.0001]\nSteps: 67%|██████▋ | 472/700 [03:24<01:38, 2.31it/s, loss=0.0637, lr=0.0001]\nSteps: 67%|██████▋ | 472/700 [03:25<01:38, 2.31it/s, loss=0.101, lr=0.0001] 
\nSteps: 68%|██████▊ | 473/700 [03:25<01:38, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 68%|██████▊ | 473/700 [03:25<01:38, 2.31it/s, loss=0.197, lr=0.0001]\nSteps: 68%|██████▊ | 474/700 [03:25<01:37, 2.31it/s, loss=0.197, lr=0.0001]\nSteps: 68%|██████▊ | 474/700 [03:25<01:37, 2.31it/s, loss=0.246, lr=0.0001]\nSteps: 68%|██████▊ | 475/700 [03:26<01:37, 2.31it/s, loss=0.246, lr=0.0001]\nSteps: 68%|██████▊ | 475/700 [03:26<01:37, 2.31it/s, loss=0.0803, lr=0.0001]\nSteps: 68%|██████▊ | 476/700 [03:26<01:36, 2.31it/s, loss=0.0803, lr=0.0001]\nSteps: 68%|██████▊ | 476/700 [03:26<01:36, 2.31it/s, loss=0.131, lr=0.0001] \nSteps: 68%|██████▊ | 477/700 [03:27<01:36, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 68%|██████▊ | 477/700 [03:27<01:36, 2.31it/s, loss=0.0571, lr=0.0001]\nSteps: 68%|██████▊ | 478/700 [03:27<01:36, 2.31it/s, loss=0.0571, lr=0.0001]\nSteps: 68%|██████▊ | 478/700 [03:27<01:36, 2.31it/s, loss=0.126, lr=0.0001] \nSteps: 68%|██████▊ | 479/700 [03:27<01:35, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 68%|██████▊ | 479/700 [03:28<01:35, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 69%|██████▊ | 480/700 [03:28<01:35, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 69%|██████▊ | 480/700 [03:28<01:35, 2.31it/s, loss=0.0757, lr=0.0001]\nSteps: 69%|██████▊ | 481/700 [03:28<01:35, 2.30it/s, loss=0.0757, lr=0.0001]\nSteps: 69%|██████▊ | 481/700 [03:28<01:35, 2.30it/s, loss=0.118, lr=0.0001] \nSteps: 69%|██████▉ | 482/700 [03:29<01:34, 2.30it/s, loss=0.118, lr=0.0001]\nSteps: 69%|██████▉ | 482/700 [03:29<01:34, 2.30it/s, loss=0.233, lr=0.0001]\nSteps: 69%|██████▉ | 483/700 [03:29<01:34, 2.30it/s, loss=0.233, lr=0.0001]\nSteps: 69%|██████▉ | 483/700 [03:29<01:34, 2.30it/s, loss=0.146, lr=0.0001]\nSteps: 69%|██████▉ | 484/700 [03:30<01:33, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 69%|██████▉ | 484/700 [03:30<01:33, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 69%|██████▉ | 485/700 [03:30<01:33, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 69%|██████▉ | 485/700 [03:30<01:33, 2.31it/s, 
loss=0.179, lr=0.0001]\nSteps: 69%|██████▉ | 486/700 [03:31<01:32, 2.31it/s, loss=0.179, lr=0.0001]\nSteps: 69%|██████▉ | 486/700 [03:31<01:32, 2.31it/s, loss=0.0674, lr=0.0001]\nSteps: 70%|██████▉ | 487/700 [03:31<01:32, 2.31it/s, loss=0.0674, lr=0.0001]\nSteps: 70%|██████▉ | 487/700 [03:31<01:32, 2.31it/s, loss=0.187, lr=0.0001] \nSteps: 70%|██████▉ | 488/700 [03:31<01:31, 2.31it/s, loss=0.187, lr=0.0001]\nSteps: 70%|██████▉ | 488/700 [03:31<01:31, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 70%|██████▉ | 489/700 [03:32<01:31, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 70%|██████▉ | 489/700 [03:32<01:31, 2.31it/s, loss=0.0499, lr=0.0001]\nSteps: 70%|███████ | 490/700 [03:32<01:30, 2.31it/s, loss=0.0499, lr=0.0001]\nSteps: 70%|███████ | 490/700 [03:32<01:30, 2.31it/s, loss=0.11, lr=0.0001] \nSteps: 70%|███████ | 491/700 [03:33<01:30, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 70%|███████ | 491/700 [03:33<01:30, 2.31it/s, loss=0.0632, lr=0.0001]\nSteps: 70%|███████ | 492/700 [03:33<01:30, 2.31it/s, loss=0.0632, lr=0.0001]\nSteps: 70%|███████ | 492/700 [03:33<01:30, 2.31it/s, loss=0.0964, lr=0.0001]\nSteps: 70%|███████ | 493/700 [03:34<01:29, 2.31it/s, loss=0.0964, lr=0.0001]\nSteps: 70%|███████ | 493/700 [03:34<01:29, 2.31it/s, loss=0.0333, lr=0.0001]\nSteps: 71%|███████ | 494/700 [03:34<01:29, 2.31it/s, loss=0.0333, lr=0.0001]\nSteps: 71%|███████ | 494/700 [03:34<01:29, 2.31it/s, loss=0.094, lr=0.0001] \nSteps: 71%|███████ | 495/700 [03:34<01:28, 2.31it/s, loss=0.094, lr=0.0001]\nSteps: 71%|███████ | 495/700 [03:34<01:28, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 71%|███████ | 496/700 [03:35<01:28, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 71%|███████ | 496/700 [03:35<01:28, 2.31it/s, loss=0.0327, lr=0.0001]\nSteps: 71%|███████ | 497/700 [03:35<01:28, 2.30it/s, loss=0.0327, lr=0.0001]\nSteps: 71%|███████ | 497/700 [03:35<01:28, 2.30it/s, loss=0.14, lr=0.0001] \nSteps: 71%|███████ | 498/700 [03:36<01:27, 2.30it/s, loss=0.14, lr=0.0001]\nSteps: 71%|███████ | 498/700 
[03:36<01:27, 2.30it/s, loss=0.0866, lr=0.0001]\nSteps: 71%|███████▏ | 499/700 [03:36<01:27, 2.31it/s, loss=0.0866, lr=0.0001]\nSteps: 71%|███████▏ | 499/700 [03:36<01:27, 2.31it/s, loss=0.132, lr=0.0001] \nSteps: 71%|███████▏ | 500/700 [03:37<01:26, 2.31it/s, loss=0.132, lr=0.0001]\nSteps: 71%|███████▏ | 500/700 [03:37<01:26, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 72%|███████▏ | 501/700 [03:37<01:26, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 72%|███████▏ | 501/700 [03:37<01:26, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 72%|███████▏ | 502/700 [03:37<01:25, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 72%|███████▏ | 502/700 [03:37<01:25, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 72%|███████▏ | 503/700 [03:38<01:25, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 72%|███████▏ | 503/700 [03:38<01:25, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 72%|███████▏ | 504/700 [03:38<01:24, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 72%|███████▏ | 504/700 [03:38<01:24, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 72%|███████▏ | 505/700 [03:39<01:24, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 72%|███████▏ | 505/700 [03:39<01:24, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 72%|███████▏ | 506/700 [03:39<01:24, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 72%|███████▏ | 506/700 [03:39<01:24, 2.31it/s, loss=0.06, lr=0.0001] \nSteps: 72%|███████▏ | 507/700 [03:40<01:23, 2.31it/s, loss=0.06, lr=0.0001]\nSteps: 72%|███████▏ | 507/700 [03:40<01:23, 2.31it/s, loss=0.144, lr=0.0001]\nSteps: 73%|███████▎ | 508/700 [03:40<01:23, 2.31it/s, loss=0.144, lr=0.0001]\nSteps: 73%|███████▎ | 508/700 [03:40<01:23, 2.31it/s, loss=0.0841, lr=0.0001]\nSteps: 73%|███████▎ | 509/700 [03:40<01:22, 2.31it/s, loss=0.0841, lr=0.0001]\nSteps: 73%|███████▎ | 509/700 [03:41<01:22, 2.31it/s, loss=0.104, lr=0.0001] \nSteps: 73%|███████▎ | 510/700 [03:41<01:22, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 73%|███████▎ | 510/700 [03:41<01:22, 2.31it/s, loss=0.0856, lr=0.0001]\nSteps: 73%|███████▎ | 511/700 [03:41<01:21, 2.31it/s, loss=0.0856, 
lr=0.0001]\nSteps: 73%|███████▎ | 511/700 [03:41<01:21, 2.31it/s, loss=0.16, lr=0.0001] \nSteps: 73%|███████▎ | 512/700 [03:42<01:21, 2.31it/s, loss=0.16, lr=0.0001]\nSteps: 73%|███████▎ | 512/700 [03:42<01:21, 2.31it/s, loss=0.0192, lr=0.0001]\nSteps: 73%|███████▎ | 513/700 [03:42<01:21, 2.30it/s, loss=0.0192, lr=0.0001]\nSteps: 73%|███████▎ | 513/700 [03:42<01:21, 2.30it/s, loss=0.0949, lr=0.0001]\nSteps: 73%|███████▎ | 514/700 [03:43<01:20, 2.30it/s, loss=0.0949, lr=0.0001]\nSteps: 73%|███████▎ | 514/700 [03:43<01:20, 2.30it/s, loss=0.223, lr=0.0001] \nSteps: 74%|███████▎ | 515/700 [03:43<01:20, 2.30it/s, loss=0.223, lr=0.0001]\nSteps: 74%|███████▎ | 515/700 [03:43<01:20, 2.30it/s, loss=0.164, lr=0.0001]\nSteps: 74%|███████▎ | 516/700 [03:44<01:19, 2.31it/s, loss=0.164, lr=0.0001]\nSteps: 74%|███████▎ | 516/700 [03:44<01:19, 2.31it/s, loss=0.0825, lr=0.0001]\nSteps: 74%|███████▍ | 517/700 [03:44<01:19, 2.31it/s, loss=0.0825, lr=0.0001]\nSteps: 74%|███████▍ | 517/700 [03:44<01:19, 2.31it/s, loss=0.133, lr=0.0001] \nSteps: 74%|███████▍ | 518/700 [03:44<01:18, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 74%|███████▍ | 518/700 [03:44<01:18, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 74%|███████▍ | 519/700 [03:45<01:18, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 74%|███████▍ | 519/700 [03:45<01:18, 2.31it/s, loss=0.162, lr=0.0001] \nSteps: 74%|███████▍ | 520/700 [03:45<01:18, 2.30it/s, loss=0.162, lr=0.0001]\nSteps: 74%|███████▍ | 520/700 [03:45<01:18, 2.30it/s, loss=0.102, lr=0.0001]\nSteps: 74%|███████▍ | 521/700 [03:46<01:17, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 74%|███████▍ | 521/700 [03:46<01:17, 2.31it/s, loss=0.145, lr=0.0001]\nSteps: 75%|███████▍ | 522/700 [03:46<01:17, 2.31it/s, loss=0.145, lr=0.0001]\nSteps: 75%|███████▍ | 522/700 [03:46<01:17, 2.31it/s, loss=0.0441, lr=0.0001]\nSteps: 75%|███████▍ | 523/700 [03:47<01:16, 2.31it/s, loss=0.0441, lr=0.0001]\nSteps: 75%|███████▍ | 523/700 [03:47<01:16, 2.31it/s, loss=0.119, lr=0.0001] \nSteps: 75%|███████▍ | 
524/700 [03:47<01:16, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 75%|███████▍ | 524/700 [03:47<01:16, 2.31it/s, loss=0.0832, lr=0.0001]\nSteps: 75%|███████▌ | 525/700 [03:47<01:16, 2.30it/s, loss=0.0832, lr=0.0001]\nSteps: 75%|███████▌ | 525/700 [03:47<01:16, 2.30it/s, loss=0.136, lr=0.0001] \nSteps: 75%|███████▌ | 526/700 [03:48<01:15, 2.30it/s, loss=0.136, lr=0.0001]\nSteps: 75%|███████▌ | 526/700 [03:48<01:15, 2.30it/s, loss=0.124, lr=0.0001]\nSteps: 75%|███████▌ | 527/700 [03:48<01:15, 2.30it/s, loss=0.124, lr=0.0001]\nSteps: 75%|███████▌ | 527/700 [03:48<01:15, 2.30it/s, loss=0.0421, lr=0.0001]\nSteps: 75%|███████▌ | 528/700 [03:49<01:14, 2.31it/s, loss=0.0421, lr=0.0001]\nSteps: 75%|███████▌ | 528/700 [03:49<01:14, 2.31it/s, loss=0.0114, lr=0.0001]\nSteps: 76%|███████▌ | 529/700 [03:49<01:14, 2.30it/s, loss=0.0114, lr=0.0001]\nSteps: 76%|███████▌ | 529/700 [03:49<01:14, 2.30it/s, loss=0.134, lr=0.0001] \nSteps: 76%|███████▌ | 530/700 [03:50<01:13, 2.30it/s, loss=0.134, lr=0.0001]\nSteps: 76%|███████▌ | 530/700 [03:50<01:13, 2.30it/s, loss=0.0501, lr=0.0001]\nSteps: 76%|███████▌ | 531/700 [03:50<01:13, 2.30it/s, loss=0.0501, lr=0.0001]\nSteps: 76%|███████▌ | 531/700 [03:50<01:13, 2.30it/s, loss=0.0874, lr=0.0001]\nSteps: 76%|███████▌ | 532/700 [03:50<01:12, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 76%|███████▌ | 532/700 [03:51<01:12, 2.31it/s, loss=0.0677, lr=0.0001]\nSteps: 76%|███████▌ | 533/700 [03:51<01:12, 2.31it/s, loss=0.0677, lr=0.0001]\nSteps: 76%|███████▌ | 533/700 [03:51<01:12, 2.31it/s, loss=0.299, lr=0.0001] \nSteps: 76%|███████▋ | 534/700 [03:51<01:12, 2.30it/s, loss=0.299, lr=0.0001]\nSteps: 76%|███████▋ | 534/700 [03:51<01:12, 2.30it/s, loss=0.12, lr=0.0001] \nSteps: 76%|███████▋ | 535/700 [03:52<01:11, 2.31it/s, loss=0.12, lr=0.0001]\nSteps: 76%|███████▋ | 535/700 [03:52<01:11, 2.31it/s, loss=0.279, lr=0.0001]\nSteps: 77%|███████▋ | 536/700 [03:52<01:11, 2.31it/s, loss=0.279, lr=0.0001]\nSteps: 77%|███████▋ | 536/700 [03:52<01:11, 2.31it/s, 
loss=0.109, lr=0.0001]\nSteps: 77%|███████▋ | 537/700 [03:53<01:10, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 77%|███████▋ | 537/700 [03:53<01:10, 2.31it/s, loss=0.0592, lr=0.0001]\nSteps: 77%|███████▋ | 538/700 [03:53<01:10, 2.31it/s, loss=0.0592, lr=0.0001]\nSteps: 77%|███████▋ | 538/700 [03:53<01:10, 2.31it/s, loss=0.101, lr=0.0001] \nSteps: 77%|███████▋ | 539/700 [03:54<01:09, 2.30it/s, loss=0.101, lr=0.0001]\nSteps: 77%|███████▋ | 539/700 [03:54<01:09, 2.30it/s, loss=0.0438, lr=0.0001]\nSteps: 77%|███████▋ | 540/700 [03:54<01:09, 2.30it/s, loss=0.0438, lr=0.0001]\nSteps: 77%|███████▋ | 540/700 [03:54<01:09, 2.30it/s, loss=0.101, lr=0.0001] \nSteps: 77%|███████▋ | 541/700 [03:54<01:09, 2.30it/s, loss=0.101, lr=0.0001]\nSteps: 77%|███████▋ | 541/700 [03:54<01:09, 2.30it/s, loss=0.139, lr=0.0001]\nSteps: 77%|███████▋ | 542/700 [03:55<01:08, 2.30it/s, loss=0.139, lr=0.0001]\nSteps: 77%|███████▋ | 542/700 [03:55<01:08, 2.30it/s, loss=0.198, lr=0.0001]\nSteps: 78%|███████▊ | 543/700 [03:55<01:08, 2.30it/s, loss=0.198, lr=0.0001]\nSteps: 78%|███████▊ | 543/700 [03:55<01:08, 2.30it/s, loss=0.171, lr=0.0001]\nSteps: 78%|███████▊ | 544/700 [03:56<01:07, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 78%|███████▊ | 544/700 [03:56<01:07, 2.31it/s, loss=0.11, lr=0.0001] \nSteps: 78%|███████▊ | 545/700 [03:56<01:07, 2.30it/s, loss=0.11, lr=0.0001]\nSteps: 78%|███████▊ | 545/700 [03:56<01:07, 2.30it/s, loss=0.117, lr=0.0001]\nSteps: 78%|███████▊ | 546/700 [03:57<01:06, 2.30it/s, loss=0.117, lr=0.0001]\nSteps: 78%|███████▊ | 546/700 [03:57<01:06, 2.30it/s, loss=0.0327, lr=0.0001]\nSteps: 78%|███████▊ | 547/700 [03:57<01:06, 2.30it/s, loss=0.0327, lr=0.0001]\nSteps: 78%|███████▊ | 547/700 [03:57<01:06, 2.30it/s, loss=0.0536, lr=0.0001]\nSteps: 78%|███████▊ | 548/700 [03:57<01:05, 2.31it/s, loss=0.0536, lr=0.0001]\nSteps: 78%|███████▊ | 548/700 [03:57<01:05, 2.31it/s, loss=0.1, lr=0.0001] \nSteps: 78%|███████▊ | 549/700 [03:58<01:05, 2.31it/s, loss=0.1, lr=0.0001]\nSteps: 78%|███████▊ 
| 549/700 [03:58<01:05, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 79%|███████▊ | 550/700 [03:58<01:04, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 79%|███████▊ | 550/700 [03:58<01:04, 2.31it/s, loss=0.0923, lr=0.0001]\nSteps: 79%|███████▊ | 551/700 [03:59<01:04, 2.31it/s, loss=0.0923, lr=0.0001]\nSteps: 79%|███████▊ | 551/700 [03:59<01:04, 2.31it/s, loss=0.13, lr=0.0001] \nSteps: 79%|███████▉ | 552/700 [03:59<01:04, 2.31it/s, loss=0.13, lr=0.0001]\nSteps: 79%|███████▉ | 552/700 [03:59<01:04, 2.31it/s, loss=0.0919, lr=0.0001]\nSteps: 79%|███████▉ | 553/700 [04:00<01:03, 2.31it/s, loss=0.0919, lr=0.0001]\nSteps: 79%|███████▉ | 553/700 [04:00<01:03, 2.31it/s, loss=0.125, lr=0.0001] \nSteps: 79%|███████▉ | 554/700 [04:00<01:03, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 79%|███████▉ | 554/700 [04:00<01:03, 2.31it/s, loss=0.0459, lr=0.0001]\nSteps: 79%|███████▉ | 555/700 [04:00<01:02, 2.31it/s, loss=0.0459, lr=0.0001]\nSteps: 79%|███████▉ | 555/700 [04:00<01:02, 2.31it/s, loss=0.178, lr=0.0001] \nSteps: 79%|███████▉ | 556/700 [04:01<01:02, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 79%|███████▉ | 556/700 [04:01<01:02, 2.31it/s, loss=0.0118, lr=0.0001]\nSteps: 80%|███████▉ | 557/700 [04:01<01:01, 2.31it/s, loss=0.0118, lr=0.0001]\nSteps: 80%|███████▉ | 557/700 [04:01<01:01, 2.31it/s, loss=0.105, lr=0.0001] \nSteps: 80%|███████▉ | 558/700 [04:02<01:01, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 80%|███████▉ | 558/700 [04:02<01:01, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 80%|███████▉ | 559/700 [04:02<01:01, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 80%|███████▉ | 559/700 [04:02<01:01, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 80%|████████ | 560/700 [04:03<01:00, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 80%|████████ | 560/700 [04:03<01:00, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 80%|████████ | 561/700 [04:03<01:00, 2.30it/s, loss=0.118, lr=0.0001]\nSteps: 80%|████████ | 561/700 [04:03<01:00, 2.30it/s, loss=0.162, lr=0.0001]\nSteps: 80%|████████ | 562/700 [04:03<00:59, 2.30it/s, 
loss=0.162, lr=0.0001]\nSteps: 80%|████████ | 562/700 [04:04<00:59, 2.30it/s, loss=0.0823, lr=0.0001]\nSteps: 80%|████████ | 563/700 [04:04<00:59, 2.30it/s, loss=0.0823, lr=0.0001]\nSteps: 80%|████████ | 563/700 [04:04<00:59, 2.30it/s, loss=0.182, lr=0.0001] \nSteps: 81%|████████ | 564/700 [04:04<00:59, 2.30it/s, loss=0.182, lr=0.0001]\nSteps: 81%|████████ | 564/700 [04:04<00:59, 2.30it/s, loss=0.118, lr=0.0001]\nSteps: 81%|████████ | 565/700 [04:05<00:58, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 81%|████████ | 565/700 [04:05<00:58, 2.31it/s, loss=0.0902, lr=0.0001]\nSteps: 81%|████████ | 566/700 [04:05<00:58, 2.31it/s, loss=0.0902, lr=0.0001]\nSteps: 81%|████████ | 566/700 [04:05<00:58, 2.31it/s, loss=0.0953, lr=0.0001]\nSteps: 81%|████████ | 567/700 [04:06<00:57, 2.31it/s, loss=0.0953, lr=0.0001]\nSteps: 81%|████████ | 567/700 [04:06<00:57, 2.31it/s, loss=0.126, lr=0.0001] \nSteps: 81%|████████ | 568/700 [04:06<00:57, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 81%|████████ | 568/700 [04:06<00:57, 2.31it/s, loss=0.0431, lr=0.0001]\nSteps: 81%|████████▏ | 569/700 [04:07<00:56, 2.31it/s, loss=0.0431, lr=0.0001]\nSteps: 81%|████████▏ | 569/700 [04:07<00:56, 2.31it/s, loss=0.0227, lr=0.0001]\nSteps: 81%|████████▏ | 570/700 [04:07<00:56, 2.31it/s, loss=0.0227, lr=0.0001]\nSteps: 81%|████████▏ | 570/700 [04:07<00:56, 2.31it/s, loss=0.192, lr=0.0001] \nSteps: 82%|████████▏ | 571/700 [04:07<00:55, 2.31it/s, loss=0.192, lr=0.0001]\nSteps: 82%|████████▏ | 571/700 [04:07<00:55, 2.31it/s, loss=0.189, lr=0.0001]\nSteps: 82%|████████▏ | 572/700 [04:08<00:55, 2.31it/s, loss=0.189, lr=0.0001]\nSteps: 82%|████████▏ | 572/700 [04:08<00:55, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 82%|████████▏ | 573/700 [04:08<00:55, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 82%|████████▏ | 573/700 [04:08<00:55, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 82%|████████▏ | 574/700 [04:09<00:54, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 82%|████████▏ | 574/700 [04:09<00:54, 2.31it/s, loss=0.133, 
lr=0.0001]\nSteps: 82%|████████▏ | 575/700 [04:09<00:54, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 82%|████████▏ | 575/700 [04:09<00:54, 2.31it/s, loss=0.0888, lr=0.0001]\nSteps: 82%|████████▏ | 576/700 [04:10<00:53, 2.31it/s, loss=0.0888, lr=0.0001]\nSteps: 82%|████████▏ | 576/700 [04:10<00:53, 2.31it/s, loss=0.128, lr=0.0001] \nSteps: 82%|████████▏ | 577/700 [04:10<00:53, 2.30it/s, loss=0.128, lr=0.0001]\nSteps: 82%|████████▏ | 577/700 [04:10<00:53, 2.30it/s, loss=0.154, lr=0.0001]\nSteps: 83%|████████▎ | 578/700 [04:10<00:53, 2.30it/s, loss=0.154, lr=0.0001]\nSteps: 83%|████████▎ | 578/700 [04:10<00:53, 2.30it/s, loss=0.062, lr=0.0001]\nSteps: 83%|████████▎ | 579/700 [04:11<00:52, 2.30it/s, loss=0.062, lr=0.0001]\nSteps: 83%|████████▎ | 579/700 [04:11<00:52, 2.30it/s, loss=0.11, lr=0.0001] \nSteps: 83%|████████▎ | 580/700 [04:11<00:52, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 83%|████████▎ | 580/700 [04:11<00:52, 2.31it/s, loss=0.0333, lr=0.0001]\nSteps: 83%|████████▎ | 581/700 [04:12<00:51, 2.31it/s, loss=0.0333, lr=0.0001]\nSteps: 83%|████████▎ | 581/700 [04:12<00:51, 2.31it/s, loss=0.0944, lr=0.0001]\nSteps: 83%|████████▎ | 582/700 [04:12<00:51, 2.31it/s, loss=0.0944, lr=0.0001]\nSteps: 83%|████████▎ | 582/700 [04:12<00:51, 2.31it/s, loss=0.106, lr=0.0001] \nSteps: 83%|████████▎ | 583/700 [04:13<00:50, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 83%|████████▎ | 583/700 [04:13<00:50, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 83%|████████▎ | 584/700 [04:13<00:50, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 83%|████████▎ | 584/700 [04:13<00:50, 2.31it/s, loss=0.0806, lr=0.0001]\nSteps: 84%|████████▎ | 585/700 [04:13<00:49, 2.31it/s, loss=0.0806, lr=0.0001]\nSteps: 84%|████████▎ | 585/700 [04:13<00:49, 2.31it/s, loss=0.157, lr=0.0001] \nSteps: 84%|████████▎ | 586/700 [04:14<00:49, 2.31it/s, loss=0.157, lr=0.0001]\nSteps: 84%|████████▎ | 586/700 [04:14<00:49, 2.31it/s, loss=0.0135, lr=0.0001]\nSteps: 84%|████████▍ | 587/700 [04:14<00:48, 2.31it/s, loss=0.0135, 
lr=0.0001]\nSteps: 84%|████████▍ | 587/700 [04:14<00:48, 2.31it/s, loss=0.244, lr=0.0001] \nSteps: 84%|████████▍ | 588/700 [04:15<00:48, 2.31it/s, loss=0.244, lr=0.0001]\nSteps: 84%|████████▍ | 588/700 [04:15<00:48, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 84%|████████▍ | 589/700 [04:15<00:48, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 84%|████████▍ | 589/700 [04:15<00:48, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 84%|████████▍ | 590/700 [04:16<00:47, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 84%|████████▍ | 590/700 [04:16<00:47, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 84%|████████▍ | 591/700 [04:16<00:47, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 84%|████████▍ | 591/700 [04:16<00:47, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 85%|████████▍ | 592/700 [04:16<00:46, 2.31it/s, loss=0.148, lr=0.0001]\nSteps: 85%|████████▍ | 592/700 [04:17<00:46, 2.31it/s, loss=0.278, lr=0.0001]\nSteps: 85%|████████▍ | 593/700 [04:17<00:46, 2.30it/s, loss=0.278, lr=0.0001]\nSteps: 85%|████████▍ | 593/700 [04:17<00:46, 2.30it/s, loss=0.134, lr=0.0001]\nSteps: 85%|████████▍ | 594/700 [04:17<00:46, 2.30it/s, loss=0.134, lr=0.0001]\nSteps: 85%|████████▍ | 594/700 [04:17<00:46, 2.30it/s, loss=0.0929, lr=0.0001]\nSteps: 85%|████████▌ | 595/700 [04:18<00:45, 2.30it/s, loss=0.0929, lr=0.0001]\nSteps: 85%|████████▌ | 595/700 [04:18<00:45, 2.30it/s, loss=0.102, lr=0.0001] \nSteps: 85%|████████▌ | 596/700 [04:18<00:45, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 85%|████████▌ | 596/700 [04:18<00:45, 2.31it/s, loss=0.0314, lr=0.0001]\nSteps: 85%|████████▌ | 597/700 [04:19<00:44, 2.31it/s, loss=0.0314, lr=0.0001]\nSteps: 85%|████████▌ | 597/700 [04:19<00:44, 2.31it/s, loss=0.15, lr=0.0001] \nSteps: 85%|████████▌ | 598/700 [04:19<00:44, 2.31it/s, loss=0.15, lr=0.0001]\nSteps: 85%|████████▌ | 598/700 [04:19<00:44, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 86%|████████▌ | 599/700 [04:20<00:43, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 86%|████████▌ | 599/700 [04:20<00:43, 2.31it/s, loss=0.0743, lr=0.0001]\nSteps: 
86%|████████▌ | 600/700 [04:20<00:43, 2.31it/s, loss=0.0743, lr=0.0001]\nSteps: 86%|████████▌ | 600/700 [04:20<00:43, 2.31it/s, loss=0.128, lr=0.0001] \nSteps: 86%|████████▌ | 601/700 [04:20<00:42, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 86%|████████▌ | 601/700 [04:20<00:42, 2.31it/s, loss=0.123, lr=0.0001]\nSteps: 86%|████████▌ | 602/700 [04:21<00:42, 2.31it/s, loss=0.123, lr=0.0001]\nSteps: 86%|████████▌ | 602/700 [04:21<00:42, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 86%|████████▌ | 603/700 [04:21<00:41, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 86%|████████▌ | 603/700 [04:21<00:41, 2.31it/s, loss=0.071, lr=0.0001]\nSteps: 86%|████████▋ | 604/700 [04:22<00:41, 2.31it/s, loss=0.071, lr=0.0001]\nSteps: 86%|████████▋ | 604/700 [04:22<00:41, 2.31it/s, loss=0.255, lr=0.0001]\nSteps: 86%|████████▋ | 605/700 [04:22<00:41, 2.31it/s, loss=0.255, lr=0.0001]\nSteps: 86%|████████▋ | 605/700 [04:22<00:41, 2.31it/s, loss=0.069, lr=0.0001]\nSteps: 87%|████████▋ | 606/700 [04:23<00:40, 2.31it/s, loss=0.069, lr=0.0001]\nSteps: 87%|████████▋ | 606/700 [04:23<00:40, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 87%|████████▋ | 607/700 [04:23<00:40, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 87%|████████▋ | 607/700 [04:23<00:40, 2.31it/s, loss=0.176, lr=0.0001]\nSteps: 87%|████████▋ | 608/700 [04:23<00:39, 2.31it/s, loss=0.176, lr=0.0001]\nSteps: 87%|████████▋ | 608/700 [04:23<00:39, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 87%|████████▋ | 609/700 [04:24<00:39, 2.29it/s, loss=0.131, lr=0.0001]\nSteps: 87%|████████▋ | 609/700 [04:24<00:39, 2.29it/s, loss=0.265, lr=0.0001]\nSteps: 87%|████████▋ | 610/700 [04:24<00:39, 2.30it/s, loss=0.265, lr=0.0001]\nSteps: 87%|████████▋ | 610/700 [04:24<00:39, 2.30it/s, loss=0.19, lr=0.0001] \nSteps: 87%|████████▋ | 611/700 [04:25<00:38, 2.30it/s, loss=0.19, lr=0.0001]\nSteps: 87%|████████▋ | 611/700 [04:25<00:38, 2.30it/s, loss=0.143, lr=0.0001]\nSteps: 87%|████████▋ | 612/700 [04:25<00:38, 2.30it/s, loss=0.143, lr=0.0001]\nSteps: 87%|████████▋ | 612/700 
[04:25<00:38, 2.30it/s, loss=0.11, lr=0.0001] \nSteps: 88%|████████▊ | 613/700 [04:26<00:37, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 88%|████████▊ | 613/700 [04:26<00:37, 2.31it/s, loss=0.327, lr=0.0001]\nSteps: 88%|████████▊ | 614/700 [04:26<00:37, 2.31it/s, loss=0.327, lr=0.0001]\nSteps: 88%|████████▊ | 614/700 [04:26<00:37, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.0661, lr=0.0001]\nSteps: 88%|████████▊ | 616/700 [04:27<00:36, 2.31it/s, loss=0.0661, lr=0.0001]\nSteps: 88%|████████▊ | 616/700 [04:27<00:36, 2.31it/s, loss=0.0279, lr=0.0001]\nSteps: 88%|████████▊ | 617/700 [04:27<00:35, 2.31it/s, loss=0.0279, lr=0.0001]\nSteps: 88%|████████▊ | 617/700 [04:27<00:35, 2.31it/s, loss=0.0887, lr=0.0001]\nSteps: 88%|████████▊ | 618/700 [04:28<00:35, 2.31it/s, loss=0.0887, lr=0.0001]\nSteps: 88%|████████▊ | 618/700 [04:28<00:35, 2.31it/s, loss=0.222, lr=0.0001] \nSteps: 88%|████████▊ | 619/700 [04:28<00:35, 2.31it/s, loss=0.222, lr=0.0001]\nSteps: 88%|████████▊ | 619/700 [04:28<00:35, 2.31it/s, loss=0.253, lr=0.0001]\nSteps: 89%|████████▊ | 620/700 [04:29<00:34, 2.31it/s, loss=0.253, lr=0.0001]\nSteps: 89%|████████▊ | 620/700 [04:29<00:34, 2.31it/s, loss=0.0884, lr=0.0001]\nSteps: 89%|████████▊ | 621/700 [04:29<00:34, 2.30it/s, loss=0.0884, lr=0.0001]\nSteps: 89%|████████▊ | 621/700 [04:29<00:34, 2.30it/s, loss=0.0895, lr=0.0001]\nSteps: 89%|████████▉ | 622/700 [04:29<00:33, 2.31it/s, loss=0.0895, lr=0.0001]\nSteps: 89%|████████▉ | 622/700 [04:30<00:33, 2.31it/s, loss=0.113, lr=0.0001] \nSteps: 89%|████████▉ | 623/700 [04:30<00:33, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 89%|████████▉ | 623/700 [04:30<00:33, 2.31it/s, loss=0.0678, lr=0.0001]\nSteps: 89%|████████▉ | 624/700 [04:30<00:32, 2.31it/s, loss=0.0678, lr=0.0001]\nSteps: 89%|████████▉ | 624/700 [04:30<00:32, 2.31it/s, loss=0.147, lr=0.0001] \nSteps: 89%|████████▉ | 625/700 
[04:31<00:32, 2.30it/s, loss=0.147, lr=0.0001]\nSteps: 89%|████████▉ | 625/700 [04:31<00:32, 2.30it/s, loss=0.087, lr=0.0001]\nSteps: 89%|████████▉ | 626/700 [04:31<00:32, 2.30it/s, loss=0.087, lr=0.0001]\nSteps: 89%|████████▉ | 626/700 [04:31<00:32, 2.30it/s, loss=0.0731, lr=0.0001]\nSteps: 90%|████████▉ | 627/700 [04:32<00:31, 2.30it/s, loss=0.0731, lr=0.0001]\nSteps: 90%|████████▉ | 627/700 [04:32<00:31, 2.30it/s, loss=0.137, lr=0.0001] \nSteps: 90%|████████▉ | 628/700 [04:32<00:31, 2.31it/s, loss=0.137, lr=0.0001]\nSteps: 90%|████████▉ | 628/700 [04:32<00:31, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 90%|████████▉ | 629/700 [04:33<00:30, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 90%|████████▉ | 629/700 [04:33<00:30, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 90%|█████████ | 630/700 [04:33<00:30, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 90%|█████████ | 630/700 [04:33<00:30, 2.31it/s, loss=0.276, lr=0.0001]\nSteps: 90%|█████████ | 631/700 [04:33<00:29, 2.31it/s, loss=0.276, lr=0.0001]\nSteps: 90%|█████████ | 631/700 [04:33<00:29, 2.31it/s, loss=0.12, lr=0.0001] \nSteps: 90%|█████████ | 632/700 [04:34<00:29, 2.31it/s, loss=0.12, lr=0.0001]\nSteps: 90%|█████████ | 632/700 [04:34<00:29, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 90%|█████████ | 633/700 [04:34<00:28, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 90%|█████████ | 633/700 [04:34<00:28, 2.31it/s, loss=0.0859, lr=0.0001]\nSteps: 91%|█████████ | 634/700 [04:35<00:28, 2.31it/s, loss=0.0859, lr=0.0001]\nSteps: 91%|█████████ | 634/700 [04:35<00:28, 2.31it/s, loss=0.0891, lr=0.0001]\nSteps: 91%|█████████ | 635/700 [04:35<00:28, 2.31it/s, loss=0.0891, lr=0.0001]\nSteps: 91%|█████████ | 635/700 [04:35<00:28, 2.31it/s, loss=0.122, lr=0.0001] \nSteps: 91%|█████████ | 636/700 [04:36<00:27, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 91%|█████████ | 636/700 [04:36<00:27, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 91%|█████████ | 637/700 [04:36<00:27, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 91%|█████████ | 637/700 [04:36<00:27, 
2.31it/s, loss=0.103, lr=0.0001]\nSteps: 91%|█████████ | 638/700 [04:36<00:26, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 91%|█████████ | 638/700 [04:36<00:26, 2.31it/s, loss=0.212, lr=0.0001]\nSteps: 91%|█████████▏| 639/700 [04:37<00:26, 2.31it/s, loss=0.212, lr=0.0001]\nSteps: 91%|█████████▏| 639/700 [04:37<00:26, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 91%|█████████▏| 640/700 [04:37<00:25, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 91%|█████████▏| 640/700 [04:37<00:25, 2.31it/s, loss=0.222, lr=0.0001]\nSteps: 92%|█████████▏| 641/700 [04:38<00:25, 2.30it/s, loss=0.222, lr=0.0001]\nSteps: 92%|█████████▏| 641/700 [04:38<00:25, 2.30it/s, loss=0.145, lr=0.0001]\nSteps: 92%|█████████▏| 642/700 [04:38<00:25, 2.30it/s, loss=0.145, lr=0.0001]\nSteps: 92%|█████████▏| 642/700 [04:38<00:25, 2.30it/s, loss=0.0954, lr=0.0001]\nSteps: 92%|█████████▏| 643/700 [04:39<00:24, 2.31it/s, loss=0.0954, lr=0.0001]\nSteps: 92%|█████████▏| 643/700 [04:39<00:24, 2.31it/s, loss=0.288, lr=0.0001] \nSteps: 92%|█████████▏| 644/700 [04:39<00:24, 2.31it/s, loss=0.288, lr=0.0001]\nSteps: 92%|█████████▏| 644/700 [04:39<00:24, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 92%|█████████▏| 645/700 [04:39<00:23, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 92%|█████████▏| 645/700 [04:39<00:23, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 92%|█████████▏| 646/700 [04:40<00:23, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 92%|█████████▏| 646/700 [04:40<00:23, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 92%|█████████▏| 647/700 [04:40<00:22, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 92%|█████████▏| 647/700 [04:40<00:22, 2.31it/s, loss=0.16, lr=0.0001] \nSteps: 93%|█████████▎| 648/700 [04:41<00:22, 2.31it/s, loss=0.16, lr=0.0001]\nSteps: 93%|█████████▎| 648/700 [04:41<00:22, 2.31it/s, loss=0.08, lr=0.0001]\nSteps: 93%|█████████▎| 649/700 [04:41<00:22, 2.31it/s, loss=0.08, lr=0.0001]\nSteps: 93%|█████████▎| 649/700 [04:41<00:22, 2.31it/s, loss=0.145, lr=0.0001]\nSteps: 93%|█████████▎| 650/700 [04:42<00:21, 2.31it/s, loss=0.145, 
lr=0.0001]\nSteps: 93%|█████████▎| 650/700 [04:42<00:21, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 93%|█████████▎| 651/700 [04:42<00:21, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 93%|█████████▎| 651/700 [04:42<00:21, 2.31it/s, loss=0.142, lr=0.0001]\nSteps: 93%|█████████▎| 652/700 [04:42<00:20, 2.31it/s, loss=0.142, lr=0.0001]\nSteps: 93%|█████████▎| 652/700 [04:43<00:20, 2.31it/s, loss=0.177, lr=0.0001]\nSteps: 93%|█████████▎| 653/700 [04:43<00:20, 2.31it/s, loss=0.177, lr=0.0001]\nSteps: 93%|█████████▎| 653/700 [04:43<00:20, 2.31it/s, loss=0.0607, lr=0.0001]\nSteps: 93%|█████████▎| 654/700 [04:43<00:19, 2.31it/s, loss=0.0607, lr=0.0001]\nSteps: 93%|█████████▎| 654/700 [04:43<00:19, 2.31it/s, loss=0.131, lr=0.0001] \nSteps: 94%|█████████▎| 655/700 [04:44<00:19, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 94%|█████████▎| 655/700 [04:44<00:19, 2.31it/s, loss=0.0542, lr=0.0001]\nSteps: 94%|█████████▎| 656/700 [04:44<00:19, 2.31it/s, loss=0.0542, lr=0.0001]\nSteps: 94%|█████████▎| 656/700 [04:44<00:19, 2.31it/s, loss=0.113, lr=0.0001] \nSteps: 94%|█████████▍| 657/700 [04:45<00:18, 2.30it/s, loss=0.113, lr=0.0001]\nSteps: 94%|█████████▍| 657/700 [04:45<00:18, 2.30it/s, loss=0.173, lr=0.0001]\nSteps: 94%|█████████▍| 658/700 [04:45<00:18, 2.30it/s, loss=0.173, lr=0.0001]\nSteps: 94%|█████████▍| 658/700 [04:45<00:18, 2.30it/s, loss=0.0329, lr=0.0001]\nSteps: 94%|█████████▍| 659/700 [04:46<00:17, 2.31it/s, loss=0.0329, lr=0.0001]\nSteps: 94%|█████████▍| 659/700 [04:46<00:17, 2.31it/s, loss=0.161, lr=0.0001] \nSteps: 94%|█████████▍| 660/700 [04:46<00:17, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 94%|█████████▍| 660/700 [04:46<00:17, 2.31it/s, loss=0.0519, lr=0.0001]\nSteps: 94%|█████████▍| 661/700 [04:46<00:16, 2.31it/s, loss=0.0519, lr=0.0001]\nSteps: 94%|█████████▍| 661/700 [04:46<00:16, 2.31it/s, loss=0.0884, lr=0.0001]\nSteps: 95%|█████████▍| 662/700 [04:47<00:16, 2.31it/s, loss=0.0884, lr=0.0001]\nSteps: 95%|█████████▍| 662/700 [04:47<00:16, 2.31it/s, loss=0.108, lr=0.0001] 
\nSteps: 95%|█████████▍| 663/700 [04:47<00:16, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 95%|█████████▍| 663/700 [04:47<00:16, 2.31it/s, loss=0.0557, lr=0.0001]\nSteps: 95%|█████████▍| 664/700 [04:48<00:15, 2.31it/s, loss=0.0557, lr=0.0001]\nSteps: 95%|█████████▍| 664/700 [04:48<00:15, 2.31it/s, loss=0.12, lr=0.0001] \nSteps: 95%|█████████▌| 665/700 [04:48<00:15, 2.31it/s, loss=0.12, lr=0.0001]\nSteps: 95%|█████████▌| 665/700 [04:48<00:15, 2.31it/s, loss=0.0976, lr=0.0001]\nSteps: 95%|█████████▌| 666/700 [04:49<00:14, 2.31it/s, loss=0.0976, lr=0.0001]\nSteps: 95%|█████████▌| 666/700 [04:49<00:14, 2.31it/s, loss=0.175, lr=0.0001] \nSteps: 95%|█████████▌| 667/700 [04:49<00:14, 2.31it/s, loss=0.175, lr=0.0001]\nSteps: 95%|█████████▌| 667/700 [04:49<00:14, 2.31it/s, loss=0.0758, lr=0.0001]\nSteps: 95%|█████████▌| 668/700 [04:49<00:13, 2.31it/s, loss=0.0758, lr=0.0001]\nSteps: 95%|█████████▌| 668/700 [04:49<00:13, 2.31it/s, loss=0.154, lr=0.0001] \nSteps: 96%|█████████▌| 669/700 [04:50<00:13, 2.31it/s, loss=0.154, lr=0.0001]\nSteps: 96%|█████████▌| 669/700 [04:50<00:13, 2.31it/s, loss=0.0661, lr=0.0001]\nSteps: 96%|█████████▌| 670/700 [04:50<00:12, 2.31it/s, loss=0.0661, lr=0.0001]\nSteps: 96%|█████████▌| 670/700 [04:50<00:12, 2.31it/s, loss=0.222, lr=0.0001] \nSteps: 96%|█████████▌| 671/700 [04:51<00:12, 2.31it/s, loss=0.222, lr=0.0001]\nSteps: 96%|█████████▌| 671/700 [04:51<00:12, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 96%|█████████▌| 672/700 [04:51<00:12, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 96%|█████████▌| 672/700 [04:51<00:12, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 96%|█████████▌| 673/700 [04:52<00:11, 2.30it/s, loss=0.117, lr=0.0001]\nSteps: 96%|█████████▌| 673/700 [04:52<00:11, 2.30it/s, loss=0.163, lr=0.0001]\nSteps: 96%|█████████▋| 674/700 [04:52<00:11, 2.30it/s, loss=0.163, lr=0.0001]\nSteps: 96%|█████████▋| 674/700 [04:52<00:11, 2.30it/s, loss=0.0756, lr=0.0001]\nSteps: 96%|█████████▋| 675/700 [04:52<00:10, 2.31it/s, loss=0.0756, lr=0.0001]\nSteps: 
96%|█████████▋| 675/700 [04:52<00:10, 2.31it/s, loss=0.178, lr=0.0001] \nSteps: 97%|█████████▋| 676/700 [04:53<00:10, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 97%|█████████▋| 676/700 [04:53<00:10, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 97%|█████████▋| 677/700 [04:53<00:09, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 97%|█████████▋| 677/700 [04:53<00:09, 2.31it/s, loss=0.139, lr=0.0001]\nSteps: 97%|█████████▋| 678/700 [04:54<00:09, 2.31it/s, loss=0.139, lr=0.0001]\nSteps: 97%|█████████▋| 678/700 [04:54<00:09, 2.31it/s, loss=0.0792, lr=0.0001]\nSteps: 97%|█████████▋| 679/700 [04:54<00:09, 2.31it/s, loss=0.0792, lr=0.0001]\nSteps: 97%|█████████▋| 679/700 [04:54<00:09, 2.31it/s, loss=0.214, lr=0.0001] \nSteps: 97%|█████████▋| 680/700 [04:55<00:08, 2.31it/s, loss=0.214, lr=0.0001]\nSteps: 97%|█████████▋| 680/700 [04:55<00:08, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 97%|█████████▋| 681/700 [04:55<00:08, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 97%|█████████▋| 681/700 [04:55<00:08, 2.31it/s, loss=0.233, lr=0.0001]\nSteps: 97%|█████████▋| 682/700 [04:55<00:07, 2.31it/s, loss=0.233, lr=0.0001]\nSteps: 97%|█████████▋| 682/700 [04:56<00:07, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 98%|█████████▊| 683/700 [04:56<00:07, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 98%|█████████▊| 683/700 [04:56<00:07, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 98%|█████████▊| 684/700 [04:56<00:06, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 98%|█████████▊| 684/700 [04:56<00:06, 2.31it/s, loss=0.176, lr=0.0001]\nSteps: 98%|█████████▊| 685/700 [04:57<00:06, 2.31it/s, loss=0.176, lr=0.0001]\nSteps: 98%|█████████▊| 685/700 [04:57<00:06, 2.31it/s, loss=0.0955, lr=0.0001]\nSteps: 98%|█████████▊| 686/700 [04:57<00:06, 2.31it/s, loss=0.0955, lr=0.0001]\nSteps: 98%|█████████▊| 686/700 [04:57<00:06, 2.31it/s, loss=0.11, lr=0.0001] \nSteps: 98%|█████████▊| 687/700 [04:58<00:05, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 98%|█████████▊| 687/700 [04:58<00:05, 2.31it/s, loss=0.139, lr=0.0001]\nSteps: 98%|█████████▊| 
688/700 [04:58<00:05, 2.31it/s, loss=0.139, lr=0.0001]\nSteps: 98%|█████████▊| 688/700 [04:58<00:05, 2.31it/s, loss=0.0515, lr=0.0001]\nSteps: 98%|█████████▊| 689/700 [04:59<00:04, 2.30it/s, loss=0.0515, lr=0.0001]\nSteps: 98%|█████████▊| 689/700 [04:59<00:04, 2.30it/s, loss=0.102, lr=0.0001] \nSteps: 99%|█████████▊| 690/700 [04:59<00:04, 2.30it/s, loss=0.102, lr=0.0001]\nSteps: 99%|█████████▊| 690/700 [04:59<00:04, 2.30it/s, loss=0.174, lr=0.0001]\nSteps: 99%|█████████▊| 691/700 [04:59<00:03, 2.31it/s, loss=0.174, lr=0.0001]\nSteps: 99%|█████████▊| 691/700 [04:59<00:03, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 99%|█████████▉| 692/700 [05:00<00:03, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 99%|█████████▉| 692/700 [05:00<00:03, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 99%|█████████▉| 693/700 [05:00<00:03, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 99%|█████████▉| 693/700 [05:00<00:03, 2.31it/s, loss=0.0503, lr=0.0001]\nSteps: 99%|█████████▉| 694/700 [05:01<00:02, 2.31it/s, loss=0.0503, lr=0.0001]\nSteps: 99%|█████████▉| 694/700 [05:01<00:02, 2.31it/s, loss=0.079, lr=0.0001] \nSteps: 99%|█████████▉| 695/700 [05:01<00:02, 2.31it/s, loss=0.079, lr=0.0001]\nSteps: 99%|█████████▉| 695/700 [05:01<00:02, 2.31it/s, loss=0.0907, lr=0.0001]\nSteps: 99%|█████████▉| 696/700 [05:02<00:01, 2.31it/s, loss=0.0907, lr=0.0001]\nSteps: 99%|█████████▉| 696/700 [05:02<00:01, 2.31it/s, loss=0.108, lr=0.0001] \nSteps: 100%|█████████▉| 697/700 [05:02<00:01, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 100%|█████████▉| 697/700 [05:02<00:01, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 100%|█████████▉| 698/700 [05:02<00:00, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 100%|█████████▉| 698/700 [05:02<00:00, 2.31it/s, loss=0.194, lr=0.0001]\nSteps: 100%|█████████▉| 699/700 [05:03<00:00, 2.31it/s, loss=0.194, lr=0.0001]\nSteps: 100%|█████████▉| 699/700 [05:03<00:00, 2.31it/s, loss=0.229, lr=0.0001]\nSteps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.229, lr=0.0001]\nSteps: 100%|██████████| 700/700 
[05:03<00:00, 2.31it/s, loss=0.141, lr=0.0001]Model weights saved in /tmp/train/output/sd35_large_train_replicate/pytorch_lora_weights.safetensors\nLoading pipeline components...: 0%| | 0/9 [00:00<?, ?it/s]\u001b[A{'base_image_seq_len', 'base_shift', 'max_shift', 'max_image_seq_len', 'use_dynamic_shifting'} was not found in config. Values will be initialized to default values.\nLoaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of stable-diffusion-3.5-large.\nLoaded text_encoder as CLIPTextModelWithProjection from `text_encoder` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 22%|██▏ | 2/9 [00:00<00:01, 5.30it/s]\u001b[A\nLoading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]\u001b[A\u001b[A\nLoading checkpoint shards: 50%|█████ | 1/2 [00:04<00:04, 4.98s/it]\u001b[A\u001b[A\nLoading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.75s/it]\u001b[A\u001b[A\nLoading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.79s/it]\nLoaded text_encoder_3 as T5EncoderModel from `text_encoder_3` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 33%|███▎ | 3/9 [00:09<00:24, 4.12s/it]\u001b[A{'dual_attention_layers'} was not found in config. 
Values will be initialized to default values.\nLoaded transformer as SD3Transformer2DModel from `transformer` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 44%|████▍ | 4/9 [00:11<00:16, 3.27s/it]\u001b[ALoaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stable-diffusion-3.5-large.\nLoaded tokenizer_3 as T5TokenizerFast from `tokenizer_3` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 67%|██████▋ | 6/9 [00:12<00:04, 1.64s/it]\u001b[ALoaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stable-diffusion-3.5-large.\nLoaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 89%|████████▉ | 8/9 [00:13<00:01, 1.30s/it]\u001b[ALoaded vae as AutoencoderKL from `vae` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 100%|██████████| 9/9 [00:13<00:00, 1.53s/it]\nSteps: 100%|██████████| 700/700 [05:18<00:00, 2.20it/s, loss=0.141, lr=0.0001]\n./\n./output/\n./output/sd35_large_train_replicate/\n./output/sd35_large_train_replicate/lora.safetensors", "metrics": { "predict_time": 381.84731858, "total_time": 448.071794 }, "output": "https://replicate.delivery/yhqm/jIEgASHbbbZOKdpHzeXSs69U7I7EuDmLQMvj0RDp1ONR8O1JA/trained_model.tar", "started_at": "2024-10-25T23:32:52.392475Z", "status": "succeeded", "urls": { "stream": "https://stream.replicate.com/v1/files/qoxq-f7ssj3usyyzpbiefbnvm5arzsinbfancnamm5xjc65jku7zrmela", "get": "https://api.replicate.com/v1/predictions/ng14j2cff1rj40cjrr2vbz667m", "cancel": "https://api.replicate.com/v1/predictions/ng14j2cff1rj40cjrr2vbz667m/cancel" }, "version": "cd6419a53b69fd410a912d945fa481a2a9ecfc4ab93062ed76c53f6e617f89e9" }
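The `logs` field in the response above is raw tqdm output, and each step appears twice: once when the counter advances (still showing the previous loss) and once when the new loss is attached. A minimal sketch of pulling one loss value per step out of such text, assuming the logs have already been read into a Python string; the names `PROGRESS` and `losses_by_step` are hypothetical helpers for illustration, not part of Replicate, Diffusers, or tqdm:

```python
import re

# Matches one tqdm update such as
# "Steps: 86%|████████▌ | 600/700 [04:20<00:43, 2.31it/s, loss=0.128, lr=0.0001]".
# The bracketed part may not cross a "]", so bars without a loss (for example
# "Loading checkpoint shards") are skipped.
PROGRESS = re.compile(r"(\d+)/(\d+) \[[^\]]*?loss=([0-9.]+)")


def losses_by_step(log_text: str) -> dict:
    """Return {step: loss}, keeping the last value printed for each step."""
    losses = {}
    for step, _total, loss in PROGRESS.findall(log_text):
        # Later updates for the same step overwrite the stale first one.
        losses[int(step)] = float(loss)
    return losses


# Four updates copied from the end of the training log above.
sample = (
    "Steps: 100%|█████████▉| 699/700 [05:03<00:00, 2.31it/s, loss=0.194, lr=0.0001]\n"
    "Steps: 100%|█████████▉| 699/700 [05:03<00:00, 2.31it/s, loss=0.229, lr=0.0001]\n"
    "Steps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.229, lr=0.0001]\n"
    "Steps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.141, lr=0.0001]"
)
result = losses_by_step(sample)
print(result)  # {699: 0.229, 700: 0.141}
```

Keeping the last value per step is a design choice based on the order of updates seen in this particular log; a different logging setup might print the loss only once per step, in which case the overwrite is harmless.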
Using seed: 3595070789
Extracted 16 files from zip to input_images
Using params: ['accelerate', 'launch', '--dynamo_backend', 'no', 'train_dreambooth_lora_sd3.py', '--pretrained_model_name_or_path', 'stable-diffusion-3.5-large', '--instance_data_dir', 'input_images', '--rank', '16', '--output_dir', '/tmp/train/output/sd35_large_train_replicate', '--mixed_precision', 'bf16', '--instance_prompt', 'Frog, yarn art style', '--resolution', '768', '--train_batch_size', '1', '--gradient_accumulation_steps', '1', '--optimizer', 'AdamW', '--learning_rate', '0.0001', '--lr_scheduler', 'constant', '--lr_warmup_steps', '0', '--max_train_steps', '700', '--checkpointing_steps', '701', '--seed', '3595070789', '--logging_dir', '/tmp/logs']
10/25/2024 23:33:02 - INFO - __main__ - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: bf16
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type t5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'base_image_seq_len', 'base_shift', 'max_shift', 'max_image_seq_len', 'use_dynamic_shifting'} was not found in config. Values will be initialized to default values.
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 50%|█████ | 1/2 [00:03<00:03, 3.67s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.64s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.64s/it]
{'dual_attention_layers'} was not found in config. Values will be initialized to default values.
10/25/2024 23:33:53 - INFO - __main__ - ***** Running training *****
10/25/2024 23:33:53 - INFO - __main__ - Num examples = 16
10/25/2024 23:33:53 - INFO - __main__ - Num batches each epoch = 16
10/25/2024 23:33:53 - INFO - __main__ - Num Epochs = 44
10/25/2024 23:33:53 - INFO - __main__ - Instantaneous batch size per device = 1
10/25/2024 23:33:53 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 1
10/25/2024 23:33:53 - INFO - __main__ - Gradient Accumulation steps = 1
10/25/2024 23:33:53 - INFO - __main__ - Total optimization steps = 700
Steps: 0%| | 0/700 [00:00<?, ?it/s] Steps: 0%| | 1/700 [00:00<07:26, 1.56it/s] Steps: 0%| | 1/700 [00:00<07:26, 1.56it/s, loss=0.132, lr=0.0001] Steps: 0%| | 2/700 [00:01<05:50, 1.99it/s, loss=0.132, lr=0.0001] Steps: 0%| | 2/700 [00:01<05:50, 1.99it/s, loss=0.189, lr=0.0001] Steps: 0%| | 3/700 [00:01<05:27, 2.13it/s, loss=0.189, lr=0.0001] Steps: 0%| | 3/700 [00:01<05:27, 2.13it/s, loss=0.0392, lr=0.0001] Steps: 1%| | 4/700 [00:01<05:17, 2.20it/s, loss=0.0392, lr=0.0001] Steps: 1%| | 4/700 [00:01<05:17, 2.20it/s, loss=0.203, lr=0.0001] Steps: 1%| | 5/700 [00:02<05:10, 2.24it/s, loss=0.203, lr=0.0001] Steps: 1%| | 5/700 [00:02<05:10, 2.24it/s, loss=0.165, lr=0.0001] Steps: 1%| | 6/700 [00:02<05:07, 2.26it/s, loss=0.165, lr=0.0001] Steps: 1%| | 6/700 [00:02<05:07, 2.26it/s, loss=0.175, lr=0.0001] Steps: 1%| | 7/700 [00:03<05:04, 2.27it/s, loss=0.175, lr=0.0001] Steps: 1%| | 7/700 [00:03<05:04, 2.27it/s, loss=0.171, lr=0.0001] Steps: 1%| | 8/700 [00:03<05:02, 2.28it/s, loss=0.171, lr=0.0001] Steps: 1%| | 8/700 [00:03<05:02, 2.28it/s, loss=0.141, lr=0.0001] Steps: 1%|▏ | 9/700 [00:04<05:01, 2.29it/s, loss=0.141, lr=0.0001] Steps: 1%|▏ | 9/700 [00:04<05:01, 2.29it/s, loss=0.203, lr=0.0001] Steps: 1%|▏ | 10/700 [00:04<05:00, 2.30it/s, loss=0.203, lr=0.0001] Steps: 1%|▏ | 10/700 [00:04<05:00, 2.30it/s, loss=0.0762, lr=0.0001] Steps: 2%|▏ | 11/700 [00:04<04:59, 2.30it/s, loss=0.0762, lr=0.0001] 
Steps: 2%|▏ | 11/700 [00:04<04:59, 2.30it/s, loss=0.0826, lr=0.0001] Steps: 2%|▏ | 12/700 [00:05<04:59, 2.30it/s, loss=0.0826, lr=0.0001] Steps: 2%|▏ | 12/700 [00:05<04:59, 2.30it/s, loss=0.19, lr=0.0001] Steps: 2%|▏ | 13/700 [00:05<04:59, 2.30it/s, loss=0.19, lr=0.0001] Steps: 2%|▏ | 13/700 [00:05<04:59, 2.30it/s, loss=0.285, lr=0.0001] Steps: 2%|▏ | 14/700 [00:06<04:58, 2.30it/s, loss=0.285, lr=0.0001] Steps: 2%|▏ | 14/700 [00:06<04:58, 2.30it/s, loss=0.144, lr=0.0001] Steps: 2%|▏ | 15/700 [00:06<04:57, 2.30it/s, loss=0.144, lr=0.0001] Steps: 2%|▏ | 15/700 [00:06<04:57, 2.30it/s, loss=0.134, lr=0.0001] Steps: 2%|▏ | 16/700 [00:07<04:56, 2.31it/s, loss=0.134, lr=0.0001] Steps: 2%|▏ | 16/700 [00:07<04:56, 2.31it/s, loss=0.189, lr=0.0001] Steps: 2%|▏ | 17/700 [00:07<04:57, 2.30it/s, loss=0.189, lr=0.0001] Steps: 2%|▏ | 17/700 [00:07<04:57, 2.30it/s, loss=0.097, lr=0.0001] Steps: 3%|▎ | 18/700 [00:07<04:56, 2.30it/s, loss=0.097, lr=0.0001] Steps: 3%|▎ | 18/700 [00:08<04:56, 2.30it/s, loss=0.215, lr=0.0001] Steps: 3%|▎ | 19/700 [00:08<04:55, 2.30it/s, loss=0.215, lr=0.0001] Steps: 3%|▎ | 19/700 [00:08<04:55, 2.30it/s, loss=0.173, lr=0.0001] Steps: 3%|▎ | 20/700 [00:08<04:55, 2.30it/s, loss=0.173, lr=0.0001] Steps: 3%|▎ | 20/700 [00:08<04:55, 2.30it/s, loss=0.0768, lr=0.0001] Steps: 3%|▎ | 21/700 [00:09<04:54, 2.30it/s, loss=0.0768, lr=0.0001] Steps: 3%|▎ | 21/700 [00:09<04:54, 2.30it/s, loss=0.0714, lr=0.0001] Steps: 3%|▎ | 22/700 [00:09<04:54, 2.30it/s, loss=0.0714, lr=0.0001] Steps: 3%|▎ | 22/700 [00:09<04:54, 2.30it/s, loss=0.148, lr=0.0001] Steps: 3%|▎ | 23/700 [00:10<04:54, 2.30it/s, loss=0.148, lr=0.0001] Steps: 3%|▎ | 23/700 [00:10<04:54, 2.30it/s, loss=0.297, lr=0.0001] Steps: 3%|▎ | 24/700 [00:10<04:53, 2.30it/s, loss=0.297, lr=0.0001] Steps: 3%|▎ | 24/700 [00:10<04:53, 2.30it/s, loss=0.0754, lr=0.0001] Steps: 4%|▎ | 25/700 [00:11<04:53, 2.30it/s, loss=0.0754, lr=0.0001] Steps: 4%|▎ | 25/700 [00:11<04:53, 2.30it/s, loss=0.116, lr=0.0001] Steps: 4%|▎ | 26/700 
[00:11<04:52, 2.30it/s, loss=0.116, lr=0.0001] Steps: 4%|▎ | 26/700 [00:11<04:52, 2.30it/s, loss=0.0963, lr=0.0001] Steps: 4%|▍ | 27/700 [00:11<04:52, 2.30it/s, loss=0.0963, lr=0.0001] Steps: 4%|▍ | 27/700 [00:11<04:52, 2.30it/s, loss=0.0578, lr=0.0001] Steps: 4%|▍ | 28/700 [00:12<04:51, 2.30it/s, loss=0.0578, lr=0.0001] Steps: 4%|▍ | 28/700 [00:12<04:51, 2.30it/s, loss=0.0973, lr=0.0001] Steps: 4%|▍ | 29/700 [00:12<04:51, 2.30it/s, loss=0.0973, lr=0.0001] Steps: 4%|▍ | 29/700 [00:12<04:51, 2.30it/s, loss=0.116, lr=0.0001] Steps: 4%|▍ | 30/700 [00:13<04:51, 2.30it/s, loss=0.116, lr=0.0001] Steps: 4%|▍ | 30/700 [00:13<04:51, 2.30it/s, loss=0.191, lr=0.0001] Steps: 4%|▍ | 31/700 [00:13<04:50, 2.30it/s, loss=0.191, lr=0.0001] Steps: 4%|▍ | 31/700 [00:13<04:50, 2.30it/s, loss=0.113, lr=0.0001] Steps: 5%|▍ | 32/700 [00:14<04:49, 2.30it/s, loss=0.113, lr=0.0001] Steps: 5%|▍ | 32/700 [00:14<04:49, 2.30it/s, loss=0.187, lr=0.0001] Steps: 5%|▍ | 33/700 [00:14<04:50, 2.29it/s, loss=0.187, lr=0.0001] Steps: 5%|▍ | 33/700 [00:14<04:50, 2.29it/s, loss=0.104, lr=0.0001] Steps: 5%|▍ | 34/700 [00:14<04:50, 2.30it/s, loss=0.104, lr=0.0001] Steps: 5%|▍ | 34/700 [00:14<04:50, 2.30it/s, loss=0.176, lr=0.0001] Steps: 5%|▌ | 35/700 [00:15<04:49, 2.30it/s, loss=0.176, lr=0.0001] Steps: 5%|▌ | 35/700 [00:15<04:49, 2.30it/s, loss=0.0212, lr=0.0001] Steps: 5%|▌ | 36/700 [00:15<04:48, 2.30it/s, loss=0.0212, lr=0.0001] Steps: 5%|▌ | 36/700 [00:15<04:48, 2.30it/s, loss=0.0399, lr=0.0001] Steps: 5%|▌ | 37/700 [00:16<04:47, 2.30it/s, loss=0.0399, lr=0.0001] Steps: 5%|▌ | 37/700 [00:16<04:47, 2.30it/s, loss=0.078, lr=0.0001] Steps: 5%|▌ | 38/700 [00:16<04:47, 2.30it/s, loss=0.078, lr=0.0001] Steps: 5%|▌ | 38/700 [00:16<04:47, 2.30it/s, loss=0.208, lr=0.0001] Steps: 6%|▌ | 39/700 [00:17<04:46, 2.31it/s, loss=0.208, lr=0.0001] Steps: 6%|▌ | 39/700 [00:17<04:46, 2.31it/s, loss=0.212, lr=0.0001] Steps: 6%|▌ | 40/700 [00:17<04:46, 2.31it/s, loss=0.212, lr=0.0001] Steps: 6%|▌ | 40/700 [00:17<04:46, 
2.31it/s, loss=0.119, lr=0.0001] Steps: 6%|▌ | 41/700 [00:17<04:45, 2.31it/s, loss=0.119, lr=0.0001] Steps: 6%|▌ | 41/700 [00:18<04:45, 2.31it/s, loss=0.186, lr=0.0001] Steps: 6%|▌ | 42/700 [00:18<04:45, 2.31it/s, loss=0.186, lr=0.0001] Steps: 6%|▌ | 42/700 [00:18<04:45, 2.31it/s, loss=0.0453, lr=0.0001] Steps: 6%|▌ | 43/700 [00:18<04:44, 2.31it/s, loss=0.0453, lr=0.0001] Steps: 6%|▌ | 43/700 [00:18<04:44, 2.31it/s, loss=0.125, lr=0.0001] Steps: 6%|▋ | 44/700 [00:19<04:44, 2.31it/s, loss=0.125, lr=0.0001] Steps: 6%|▋ | 44/700 [00:19<04:44, 2.31it/s, loss=0.299, lr=0.0001] Steps: 6%|▋ | 45/700 [00:19<04:43, 2.31it/s, loss=0.299, lr=0.0001] Steps: 6%|▋ | 45/700 [00:19<04:43, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 7%|▋ | 46/700 [00:20<04:43, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 7%|▋ | 46/700 [00:20<04:43, 2.31it/s, loss=0.178, lr=0.0001] Steps: 7%|▋ | 47/700 [00:20<04:43, 2.31it/s, loss=0.178, lr=0.0001] Steps: 7%|▋ | 47/700 [00:20<04:43, 2.31it/s, loss=0.166, lr=0.0001] Steps: 7%|▋ | 48/700 [00:21<04:42, 2.31it/s, loss=0.166, lr=0.0001] Steps: 7%|▋ | 48/700 [00:21<04:42, 2.31it/s, loss=0.0528, lr=0.0001] Steps: 7%|▋ | 49/700 [00:21<04:43, 2.30it/s, loss=0.0528, lr=0.0001] Steps: 7%|▋ | 49/700 [00:21<04:43, 2.30it/s, loss=0.159, lr=0.0001] Steps: 7%|▋ | 50/700 [00:21<04:42, 2.30it/s, loss=0.159, lr=0.0001] Steps: 7%|▋ | 50/700 [00:21<04:42, 2.30it/s, loss=0.103, lr=0.0001] Steps: 7%|▋ | 51/700 [00:22<04:41, 2.30it/s, loss=0.103, lr=0.0001] Steps: 7%|▋ | 51/700 [00:22<04:41, 2.30it/s, loss=0.034, lr=0.0001] Steps: 7%|▋ | 52/700 [00:22<04:41, 2.30it/s, loss=0.034, lr=0.0001] Steps: 7%|▋ | 52/700 [00:22<04:41, 2.30it/s, loss=0.0843, lr=0.0001] Steps: 8%|▊ | 53/700 [00:23<04:40, 2.31it/s, loss=0.0843, lr=0.0001] Steps: 8%|▊ | 53/700 [00:23<04:40, 2.31it/s, loss=0.163, lr=0.0001] Steps: 8%|▊ | 54/700 [00:23<04:40, 2.31it/s, loss=0.163, lr=0.0001] Steps: 8%|▊ | 54/700 [00:23<04:40, 2.31it/s, loss=0.202, lr=0.0001] Steps: 8%|▊ | 55/700 [00:24<04:40, 2.30it/s, 
loss=0.202, lr=0.0001] Steps: 8%|▊ | 55/700 [00:24<04:40, 2.30it/s, loss=0.178, lr=0.0001] Steps: 8%|▊ | 56/700 [00:24<04:39, 2.31it/s, loss=0.178, lr=0.0001] Steps: 8%|▊ | 56/700 [00:24<04:39, 2.31it/s, loss=0.215, lr=0.0001] Steps: 8%|▊ | 57/700 [00:24<04:38, 2.31it/s, loss=0.215, lr=0.0001] Steps: 8%|▊ | 57/700 [00:24<04:38, 2.31it/s, loss=0.0982, lr=0.0001] Steps: 8%|▊ | 58/700 [00:25<04:38, 2.31it/s, loss=0.0982, lr=0.0001] Steps: 8%|▊ | 58/700 [00:25<04:38, 2.31it/s, loss=0.143, lr=0.0001] Steps: 8%|▊ | 59/700 [00:25<04:37, 2.31it/s, loss=0.143, lr=0.0001] Steps: 8%|▊ | 59/700 [00:25<04:37, 2.31it/s, loss=0.156, lr=0.0001] Steps: 9%|▊ | 60/700 [00:26<04:37, 2.31it/s, loss=0.156, lr=0.0001] Steps: 9%|▊ | 60/700 [00:26<04:37, 2.31it/s, loss=0.117, lr=0.0001] Steps: 9%|▊ | 61/700 [00:26<04:36, 2.31it/s, loss=0.117, lr=0.0001] Steps: 9%|▊ | 61/700 [00:26<04:36, 2.31it/s, loss=0.168, lr=0.0001] Steps: 9%|▉ | 62/700 [00:27<04:36, 2.31it/s, loss=0.168, lr=0.0001] Steps: 9%|▉ | 62/700 [00:27<04:36, 2.31it/s, loss=0.098, lr=0.0001] Steps: 9%|▉ | 63/700 [00:27<04:36, 2.31it/s, loss=0.098, lr=0.0001] Steps: 9%|▉ | 63/700 [00:27<04:36, 2.31it/s, loss=0.16, lr=0.0001] Steps: 9%|▉ | 64/700 [00:27<04:35, 2.31it/s, loss=0.16, lr=0.0001] Steps: 9%|▉ | 64/700 [00:27<04:35, 2.31it/s, loss=0.0913, lr=0.0001] Steps: 9%|▉ | 65/700 [00:28<04:36, 2.30it/s, loss=0.0913, lr=0.0001] Steps: 9%|▉ | 65/700 [00:28<04:36, 2.30it/s, loss=0.232, lr=0.0001] Steps: 9%|▉ | 66/700 [00:28<04:36, 2.29it/s, loss=0.232, lr=0.0001] Steps: 9%|▉ | 66/700 [00:28<04:36, 2.29it/s, loss=0.204, lr=0.0001] Steps: 10%|▉ | 67/700 [00:29<04:35, 2.30it/s, loss=0.204, lr=0.0001] Steps: 10%|▉ | 67/700 [00:29<04:35, 2.30it/s, loss=0.0839, lr=0.0001] Steps: 10%|▉ | 68/700 [00:29<04:34, 2.30it/s, loss=0.0839, lr=0.0001] Steps: 10%|▉ | 68/700 [00:29<04:34, 2.30it/s, loss=0.163, lr=0.0001] Steps: 10%|▉ | 69/700 [00:30<04:33, 2.30it/s, loss=0.163, lr=0.0001] Steps: 10%|▉ | 69/700 [00:30<04:33, 2.30it/s, loss=0.117, 
lr=0.0001] Steps: 10%|█ | 70/700 [00:30<04:33, 2.31it/s, loss=0.117, lr=0.0001] Steps: 10%|█ | 70/700 [00:30<04:33, 2.31it/s, loss=0.116, lr=0.0001] Steps: 10%|█ | 71/700 [00:30<04:32, 2.31it/s, loss=0.116, lr=0.0001] Steps: 10%|█ | 71/700 [00:31<04:32, 2.31it/s, loss=0.273, lr=0.0001] Steps: 10%|█ | 72/700 [00:31<04:32, 2.31it/s, loss=0.273, lr=0.0001] Steps: 10%|█ | 72/700 [00:31<04:32, 2.31it/s, loss=0.2, lr=0.0001] Steps: 10%|█ | 73/700 [00:31<04:31, 2.31it/s, loss=0.2, lr=0.0001] Steps: 10%|█ | 73/700 [00:31<04:31, 2.31it/s, loss=0.189, lr=0.0001] Steps: 11%|█ | 74/700 [00:32<04:31, 2.31it/s, loss=0.189, lr=0.0001] Steps: 11%|█ | 74/700 [00:32<04:31, 2.31it/s, loss=0.201, lr=0.0001] Steps: 11%|█ | 75/700 [00:32<04:30, 2.31it/s, loss=0.201, lr=0.0001] Steps: 11%|█ | 75/700 [00:32<04:30, 2.31it/s, loss=0.13, lr=0.0001] Steps: 11%|█ | 76/700 [00:33<04:30, 2.31it/s, loss=0.13, lr=0.0001] Steps: 11%|█ | 76/700 [00:33<04:30, 2.31it/s, loss=0.128, lr=0.0001] Steps: 11%|█ | 77/700 [00:33<04:29, 2.31it/s, loss=0.128, lr=0.0001] Steps: 11%|█ | 77/700 [00:33<04:29, 2.31it/s, loss=0.19, lr=0.0001] Steps: 11%|█ | 78/700 [00:34<04:29, 2.31it/s, loss=0.19, lr=0.0001] Steps: 11%|█ | 78/700 [00:34<04:29, 2.31it/s, loss=0.117, lr=0.0001] Steps: 11%|█▏ | 79/700 [00:34<04:28, 2.31it/s, loss=0.117, lr=0.0001] Steps: 11%|█▏ | 79/700 [00:34<04:28, 2.31it/s, loss=0.0576, lr=0.0001] Steps: 11%|█▏ | 80/700 [00:34<04:28, 2.31it/s, loss=0.0576, lr=0.0001] Steps: 11%|█▏ | 80/700 [00:34<04:28, 2.31it/s, loss=0.0391, lr=0.0001] Steps: 12%|█▏ | 81/700 [00:35<04:29, 2.30it/s, loss=0.0391, lr=0.0001] Steps: 12%|█▏ | 81/700 [00:35<04:29, 2.30it/s, loss=0.157, lr=0.0001] Steps: 12%|█▏ | 82/700 [00:35<04:28, 2.30it/s, loss=0.157, lr=0.0001] Steps: 12%|█▏ | 82/700 [00:35<04:28, 2.30it/s, loss=0.0326, lr=0.0001] Steps: 12%|█▏ | 83/700 [00:36<04:27, 2.30it/s, loss=0.0326, lr=0.0001] Steps: 12%|█▏ | 83/700 [00:36<04:27, 2.30it/s, loss=0.0692, lr=0.0001] Steps: 12%|█▏ | 84/700 [00:36<04:27, 2.30it/s, 
loss=0.0692, lr=0.0001] Steps: 12%|█▏ | 84/700 [00:36<04:27, 2.30it/s, loss=0.175, lr=0.0001] Steps: 12%|█▏ | 85/700 [00:37<04:26, 2.31it/s, loss=0.175, lr=0.0001] Steps: 12%|█▏ | 85/700 [00:37<04:26, 2.31it/s, loss=0.134, lr=0.0001] Steps: 12%|█▏ | 86/700 [00:37<04:26, 2.31it/s, loss=0.134, lr=0.0001] Steps: 12%|█▏ | 86/700 [00:37<04:26, 2.31it/s, loss=0.137, lr=0.0001] Steps: 12%|█▏ | 87/700 [00:37<04:26, 2.30it/s, loss=0.137, lr=0.0001] Steps: 12%|█▏ | 87/700 [00:37<04:26, 2.30it/s, loss=0.0814, lr=0.0001] Steps: 13%|█▎ | 88/700 [00:38<04:25, 2.30it/s, loss=0.0814, lr=0.0001] Steps: 13%|█▎ | 88/700 [00:38<04:25, 2.30it/s, loss=0.29, lr=0.0001] Steps: 13%|█▎ | 89/700 [00:38<04:25, 2.31it/s, loss=0.29, lr=0.0001] Steps: 13%|█▎ | 89/700 [00:38<04:25, 2.31it/s, loss=0.122, lr=0.0001] Steps: 13%|█▎ | 90/700 [00:39<04:24, 2.31it/s, loss=0.122, lr=0.0001] Steps: 13%|█▎ | 90/700 [00:39<04:24, 2.31it/s, loss=0.0188, lr=0.0001] Steps: 13%|█▎ | 91/700 [00:39<04:24, 2.31it/s, loss=0.0188, lr=0.0001] Steps: 13%|█▎ | 91/700 [00:39<04:24, 2.31it/s, loss=0.146, lr=0.0001] Steps: 13%|█▎ | 92/700 [00:40<04:23, 2.31it/s, loss=0.146, lr=0.0001] Steps: 13%|█▎ | 92/700 [00:40<04:23, 2.31it/s, loss=0.0699, lr=0.0001] Steps: 13%|█▎ | 93/700 [00:40<04:22, 2.31it/s, loss=0.0699, lr=0.0001] Steps: 13%|█▎ | 93/700 [00:40<04:22, 2.31it/s, loss=0.0927, lr=0.0001] Steps: 13%|█▎ | 94/700 [00:40<04:22, 2.31it/s, loss=0.0927, lr=0.0001] Steps: 13%|█▎ | 94/700 [00:40<04:22, 2.31it/s, loss=0.147, lr=0.0001] Steps: 14%|█▎ | 95/700 [00:41<04:21, 2.31it/s, loss=0.147, lr=0.0001] Steps: 14%|█▎ | 95/700 [00:41<04:21, 2.31it/s, loss=0.0597, lr=0.0001] Steps: 14%|█▎ | 96/700 [00:41<04:21, 2.31it/s, loss=0.0597, lr=0.0001] Steps: 14%|█▎ | 96/700 [00:41<04:21, 2.31it/s, loss=0.107, lr=0.0001] Steps: 14%|█▍ | 97/700 [00:42<04:22, 2.30it/s, loss=0.107, lr=0.0001] Steps: 14%|█▍ | 97/700 [00:42<04:22, 2.30it/s, loss=0.103, lr=0.0001] Steps: 14%|█▍ | 98/700 [00:42<04:21, 2.30it/s, loss=0.103, lr=0.0001] Steps: 
14%|█▍ | 98/700 [00:42<04:21, 2.30it/s, loss=0.127, lr=0.0001] Steps: 14%|█▍ | 99/700 [00:43<04:21, 2.30it/s, loss=0.127, lr=0.0001] Steps: 14%|█▍ | 99/700 [00:43<04:21, 2.30it/s, loss=0.0597, lr=0.0001] Steps: 14%|█▍ | 100/700 [00:43<04:21, 2.30it/s, loss=0.0597, lr=0.0001] Steps: 14%|█▍ | 100/700 [00:43<04:21, 2.30it/s, loss=0.0843, lr=0.0001] Steps: 14%|█▍ | 101/700 [00:44<04:20, 2.30it/s, loss=0.0843, lr=0.0001] Steps: 14%|█▍ | 101/700 [00:44<04:20, 2.30it/s, loss=0.0791, lr=0.0001] Steps: 15%|█▍ | 102/700 [00:44<04:19, 2.30it/s, loss=0.0791, lr=0.0001] Steps: 15%|█▍ | 102/700 [00:44<04:19, 2.30it/s, loss=0.0923, lr=0.0001] Steps: 15%|█▍ | 103/700 [00:44<04:19, 2.30it/s, loss=0.0923, lr=0.0001] Steps: 15%|█▍ | 103/700 [00:44<04:19, 2.30it/s, loss=0.159, lr=0.0001] Steps: 15%|█▍ | 104/700 [00:45<04:18, 2.30it/s, loss=0.159, lr=0.0001] Steps: 15%|█▍ | 104/700 [00:45<04:18, 2.30it/s, loss=0.304, lr=0.0001] Steps: 15%|█▌ | 105/700 [00:45<04:18, 2.30it/s, loss=0.304, lr=0.0001] Steps: 15%|█▌ | 105/700 [00:45<04:18, 2.30it/s, loss=0.0677, lr=0.0001] Steps: 15%|█▌ | 106/700 [00:46<04:17, 2.31it/s, loss=0.0677, lr=0.0001] Steps: 15%|█▌ | 106/700 [00:46<04:17, 2.31it/s, loss=0.102, lr=0.0001] Steps: 15%|█▌ | 107/700 [00:46<04:17, 2.31it/s, loss=0.102, lr=0.0001] Steps: 15%|█▌ | 107/700 [00:46<04:17, 2.31it/s, loss=0.129, lr=0.0001] Steps: 15%|█▌ | 108/700 [00:47<04:16, 2.31it/s, loss=0.129, lr=0.0001] Steps: 15%|█▌ | 108/700 [00:47<04:16, 2.31it/s, loss=0.131, lr=0.0001] Steps: 16%|█▌ | 109/700 [00:47<04:16, 2.31it/s, loss=0.131, lr=0.0001] Steps: 16%|█▌ | 109/700 [00:47<04:16, 2.31it/s, loss=0.0958, lr=0.0001] Steps: 16%|█▌ | 110/700 [00:47<04:15, 2.31it/s, loss=0.0958, lr=0.0001] Steps: 16%|█▌ | 110/700 [00:47<04:15, 2.31it/s, loss=0.244, lr=0.0001] Steps: 16%|█▌ | 111/700 [00:48<04:15, 2.31it/s, loss=0.244, lr=0.0001] Steps: 16%|█▌ | 111/700 [00:48<04:15, 2.31it/s, loss=0.278, lr=0.0001] Steps: 16%|█▌ | 112/700 [00:48<04:14, 2.31it/s, loss=0.278, lr=0.0001] Steps: 
16%|█▌ | 112/700 [00:48<04:14, 2.31it/s, loss=0.1, lr=0.0001] Steps: 16%|█▌ | 113/700 [00:49<04:15, 2.30it/s, loss=0.1, lr=0.0001] Steps: 16%|█▌ | 113/700 [00:49<04:15, 2.30it/s, loss=0.133, lr=0.0001] Steps: 16%|█▋ | 114/700 [00:49<04:14, 2.30it/s, loss=0.133, lr=0.0001] Steps: 16%|█▋ | 114/700 [00:49<04:14, 2.30it/s, loss=0.253, lr=0.0001] Steps: 16%|█▋ | 115/700 [00:50<04:14, 2.30it/s, loss=0.253, lr=0.0001] Steps: 16%|█▋ | 115/700 [00:50<04:14, 2.30it/s, loss=0.114, lr=0.0001] Steps: 17%|█▋ | 116/700 [00:50<04:13, 2.30it/s, loss=0.114, lr=0.0001] Steps: 17%|█▋ | 116/700 [00:50<04:13, 2.30it/s, loss=0.154, lr=0.0001] Steps: 17%|█▋ | 117/700 [00:50<04:14, 2.29it/s, loss=0.154, lr=0.0001] Steps: 17%|█▋ | 117/700 [00:50<04:14, 2.29it/s, loss=0.202, lr=0.0001] Steps: 17%|█▋ | 118/700 [00:51<04:14, 2.29it/s, loss=0.202, lr=0.0001] Steps: 17%|█▋ | 118/700 [00:51<04:14, 2.29it/s, loss=0.0992, lr=0.0001] Steps: 17%|█▋ | 119/700 [00:51<04:13, 2.29it/s, loss=0.0992, lr=0.0001] Steps: 17%|█▋ | 119/700 [00:51<04:13, 2.29it/s, loss=0.166, lr=0.0001] Steps: 17%|█▋ | 120/700 [00:52<04:12, 2.30it/s, loss=0.166, lr=0.0001] Steps: 17%|█▋ | 120/700 [00:52<04:12, 2.30it/s, loss=0.124, lr=0.0001] Steps: 17%|█▋ | 121/700 [00:52<04:11, 2.30it/s, loss=0.124, lr=0.0001] Steps: 17%|█▋ | 121/700 [00:52<04:11, 2.30it/s, loss=0.0382, lr=0.0001] Steps: 17%|█▋ | 122/700 [00:53<04:11, 2.29it/s, loss=0.0382, lr=0.0001] Steps: 17%|█▋ | 122/700 [00:53<04:11, 2.29it/s, loss=0.0882, lr=0.0001] Steps: 18%|█▊ | 123/700 [00:53<04:11, 2.30it/s, loss=0.0882, lr=0.0001] Steps: 18%|█▊ | 123/700 [00:53<04:11, 2.30it/s, loss=0.0856, lr=0.0001] Steps: 18%|█▊ | 124/700 [00:54<04:10, 2.30it/s, loss=0.0856, lr=0.0001] Steps: 18%|█▊ | 124/700 [00:54<04:10, 2.30it/s, loss=0.145, lr=0.0001] Steps: 18%|█▊ | 125/700 [00:54<04:10, 2.29it/s, loss=0.145, lr=0.0001] Steps: 18%|█▊ | 125/700 [00:54<04:10, 2.29it/s, loss=0.14, lr=0.0001] Steps: 18%|█▊ | 126/700 [00:54<04:09, 2.30it/s, loss=0.14, lr=0.0001] Steps: 18%|█▊ | 
126/700 [00:54<04:09, 2.30it/s, loss=0.194, lr=0.0001] Steps: 18%|█▊ | 127/700 [00:55<04:08, 2.31it/s, loss=0.194, lr=0.0001] Steps: 18%|█▊ | 127/700 [00:55<04:08, 2.31it/s, loss=0.101, lr=0.0001] Steps: 18%|█▊ | 128/700 [00:55<04:07, 2.31it/s, loss=0.101, lr=0.0001] Steps: 18%|█▊ | 128/700 [00:55<04:07, 2.31it/s, loss=0.106, lr=0.0001] Steps: 18%|█▊ | 129/700 [00:56<04:08, 2.30it/s, loss=0.106, lr=0.0001] Steps: 18%|█▊ | 129/700 [00:56<04:08, 2.30it/s, loss=0.138, lr=0.0001] Steps: 19%|█▊ | 130/700 [00:56<04:07, 2.30it/s, loss=0.138, lr=0.0001] Steps: 19%|█▊ | 130/700 [00:56<04:07, 2.30it/s, loss=0.229, lr=0.0001] Steps: 19%|█▊ | 131/700 [00:57<04:07, 2.30it/s, loss=0.229, lr=0.0001] Steps: 19%|█▊ | 131/700 [00:57<04:07, 2.30it/s, loss=0.125, lr=0.0001] Steps: 19%|█▉ | 132/700 [00:57<04:06, 2.30it/s, loss=0.125, lr=0.0001] Steps: 19%|█▉ | 132/700 [00:57<04:06, 2.30it/s, loss=0.251, lr=0.0001] Steps: 19%|█▉ | 133/700 [00:57<04:06, 2.30it/s, loss=0.251, lr=0.0001] Steps: 19%|█▉ | 133/700 [00:57<04:06, 2.30it/s, loss=0.111, lr=0.0001] Steps: 19%|█▉ | 134/700 [00:58<04:05, 2.30it/s, loss=0.111, lr=0.0001] Steps: 19%|█▉ | 134/700 [00:58<04:05, 2.30it/s, loss=0.0731, lr=0.0001] Steps: 19%|█▉ | 135/700 [00:58<04:05, 2.30it/s, loss=0.0731, lr=0.0001] Steps: 19%|█▉ | 135/700 [00:58<04:05, 2.30it/s, loss=0.146, lr=0.0001] Steps: 19%|█▉ | 136/700 [00:59<04:05, 2.30it/s, loss=0.146, lr=0.0001] Steps: 19%|█▉ | 136/700 [00:59<04:05, 2.30it/s, loss=0.0851, lr=0.0001] Steps: 20%|█▉ | 137/700 [00:59<04:04, 2.30it/s, loss=0.0851, lr=0.0001] Steps: 20%|█▉ | 137/700 [00:59<04:04, 2.30it/s, loss=0.245, lr=0.0001] Steps: 20%|█▉ | 138/700 [01:00<04:03, 2.31it/s, loss=0.245, lr=0.0001] Steps: 20%|█▉ | 138/700 [01:00<04:03, 2.31it/s, loss=0.113, lr=0.0001] Steps: 20%|█▉ | 139/700 [01:00<04:03, 2.30it/s, loss=0.113, lr=0.0001] Steps: 20%|█▉ | 139/700 [01:00<04:03, 2.30it/s, loss=0.158, lr=0.0001] Steps: 20%|██ | 140/700 [01:00<04:02, 2.31it/s, loss=0.158, lr=0.0001] Steps: 20%|██ | 140/700 
[01:00<04:02, 2.31it/s, loss=0.0694, lr=0.0001] Steps: 20%|██ | 141/700 [01:01<04:02, 2.31it/s, loss=0.0694, lr=0.0001] Steps: 20%|██ | 141/700 [01:01<04:02, 2.31it/s, loss=0.0592, lr=0.0001] Steps: 20%|██ | 142/700 [01:01<04:02, 2.31it/s, loss=0.0592, lr=0.0001] Steps: 20%|██ | 142/700 [01:01<04:02, 2.31it/s, loss=0.0842, lr=0.0001] Steps: 20%|██ | 143/700 [01:02<04:01, 2.31it/s, loss=0.0842, lr=0.0001] Steps: 20%|██ | 143/700 [01:02<04:01, 2.31it/s, loss=0.286, lr=0.0001] Steps: 21%|██ | 144/700 [01:02<04:00, 2.31it/s, loss=0.286, lr=0.0001] Steps: 21%|██ | 144/700 [01:02<04:00, 2.31it/s, loss=0.153, lr=0.0001] Steps: 21%|██ | 145/700 [01:03<04:01, 2.30it/s, loss=0.153, lr=0.0001] Steps: 21%|██ | 145/700 [01:03<04:01, 2.30it/s, loss=0.128, lr=0.0001] Steps: 21%|██ | 146/700 [01:03<04:00, 2.30it/s, loss=0.128, lr=0.0001] Steps: 21%|██ | 146/700 [01:03<04:00, 2.30it/s, loss=0.135, lr=0.0001] Steps: 21%|██ | 147/700 [01:03<03:59, 2.30it/s, loss=0.135, lr=0.0001] Steps: 21%|██ | 147/700 [01:04<03:59, 2.30it/s, loss=0.133, lr=0.0001] Steps: 21%|██ | 148/700 [01:04<03:59, 2.31it/s, loss=0.133, lr=0.0001] Steps: 21%|██ | 148/700 [01:04<03:59, 2.31it/s, loss=0.139, lr=0.0001] Steps: 21%|██▏ | 149/700 [01:04<03:58, 2.31it/s, loss=0.139, lr=0.0001] Steps: 21%|██▏ | 149/700 [01:04<03:58, 2.31it/s, loss=0.0741, lr=0.0001] Steps: 21%|██▏ | 150/700 [01:05<03:58, 2.31it/s, loss=0.0741, lr=0.0001] Steps: 21%|██▏ | 150/700 [01:05<03:58, 2.31it/s, loss=0.26, lr=0.0001] Steps: 22%|██▏ | 151/700 [01:05<03:57, 2.31it/s, loss=0.26, lr=0.0001] Steps: 22%|██▏ | 151/700 [01:05<03:57, 2.31it/s, loss=0.14, lr=0.0001] Steps: 22%|██▏ | 152/700 [01:06<03:57, 2.31it/s, loss=0.14, lr=0.0001] Steps: 22%|██▏ | 152/700 [01:06<03:57, 2.31it/s, loss=0.118, lr=0.0001] Steps: 22%|██▏ | 153/700 [01:06<03:56, 2.31it/s, loss=0.118, lr=0.0001] Steps: 22%|██▏ | 153/700 [01:06<03:56, 2.31it/s, loss=0.119, lr=0.0001] Steps: 22%|██▏ | 154/700 [01:07<03:56, 2.31it/s, loss=0.119, lr=0.0001] Steps: 22%|██▏ | 
154/700 [01:07<03:56, 2.31it/s, loss=0.0301, lr=0.0001] Steps: 22%|██▏ | 155/700 [01:07<03:55, 2.31it/s, loss=0.0301, lr=0.0001] Steps: 22%|██▏ | 155/700 [01:07<03:55, 2.31it/s, loss=0.147, lr=0.0001] Steps: 22%|██▏ | 156/700 [01:07<03:55, 2.31it/s, loss=0.147, lr=0.0001] Steps: 22%|██▏ | 156/700 [01:07<03:55, 2.31it/s, loss=0.246, lr=0.0001] Steps: 22%|██▏ | 157/700 [01:08<03:55, 2.31it/s, loss=0.246, lr=0.0001] Steps: 22%|██▏ | 157/700 [01:08<03:55, 2.31it/s, loss=0.281, lr=0.0001] Steps: 23%|██▎ | 158/700 [01:08<03:54, 2.31it/s, loss=0.281, lr=0.0001] Steps: 23%|██▎ | 158/700 [01:08<03:54, 2.31it/s, loss=0.114, lr=0.0001] Steps: 23%|██▎ | 159/700 [01:09<03:54, 2.31it/s, loss=0.114, lr=0.0001] Steps: 23%|██▎ | 159/700 [01:09<03:54, 2.31it/s, loss=0.0437, lr=0.0001] Steps: 23%|██▎ | 160/700 [01:09<03:53, 2.31it/s, loss=0.0437, lr=0.0001] Steps: 23%|██▎ | 160/700 [01:09<03:53, 2.31it/s, loss=0.0781, lr=0.0001] Steps: 23%|██▎ | 161/700 [01:10<03:54, 2.30it/s, loss=0.0781, lr=0.0001] Steps: 23%|██▎ | 161/700 [01:10<03:54, 2.30it/s, loss=0.0544, lr=0.0001] Steps: 23%|██▎ | 162/700 [01:10<03:53, 2.30it/s, loss=0.0544, lr=0.0001] Steps: 23%|██▎ | 162/700 [01:10<03:53, 2.30it/s, loss=0.199, lr=0.0001] Steps: 23%|██▎ | 163/700 [01:10<03:53, 2.30it/s, loss=0.199, lr=0.0001] Steps: 23%|██▎ | 163/700 [01:10<03:53, 2.30it/s, loss=0.164, lr=0.0001] Steps: 23%|██▎ | 164/700 [01:11<03:52, 2.31it/s, loss=0.164, lr=0.0001] Steps: 23%|██▎ | 164/700 [01:11<03:52, 2.31it/s, loss=0.0932, lr=0.0001] Steps: 24%|██▎ | 165/700 [01:11<03:51, 2.31it/s, loss=0.0932, lr=0.0001] Steps: 24%|██▎ | 165/700 [01:11<03:51, 2.31it/s, loss=0.116, lr=0.0001] Steps: 24%|██▎ | 166/700 [01:12<03:51, 2.31it/s, loss=0.116, lr=0.0001] Steps: 24%|██▎ | 166/700 [01:12<03:51, 2.31it/s, loss=0.0942, lr=0.0001] Steps: 24%|██▍ | 167/700 [01:12<03:50, 2.31it/s, loss=0.0942, lr=0.0001] Steps: 24%|██▍ | 167/700 [01:12<03:50, 2.31it/s, loss=0.105, lr=0.0001] Steps: 24%|██▍ | 168/700 [01:13<03:50, 2.31it/s, loss=0.105, 
lr=0.0001] Steps: 24%|██▍ | 168/700 [01:13<03:50, 2.31it/s, loss=0.141, lr=0.0001] Steps: 24%|██▍ | 169/700 [01:13<03:50, 2.31it/s, loss=0.141, lr=0.0001] Steps: 24%|██▍ | 169/700 [01:13<03:50, 2.31it/s, loss=0.146, lr=0.0001] Steps: 24%|██▍ | 170/700 [01:13<03:49, 2.31it/s, loss=0.146, lr=0.0001] Steps: 24%|██▍ | 170/700 [01:13<03:49, 2.31it/s, loss=0.0638, lr=0.0001] Steps: 24%|██▍ | 171/700 [01:14<03:49, 2.31it/s, loss=0.0638, lr=0.0001] Steps: 24%|██▍ | 171/700 [01:14<03:49, 2.31it/s, loss=0.16, lr=0.0001] Steps: 25%|██▍ | 172/700 [01:14<03:48, 2.31it/s, loss=0.16, lr=0.0001] Steps: 25%|██▍ | 172/700 [01:14<03:48, 2.31it/s, loss=0.215, lr=0.0001] Steps: 25%|██▍ | 173/700 [01:15<03:48, 2.31it/s, loss=0.215, lr=0.0001] Steps: 25%|██▍ | 173/700 [01:15<03:48, 2.31it/s, loss=0.21, lr=0.0001] Steps: 25%|██▍ | 174/700 [01:15<03:47, 2.31it/s, loss=0.21, lr=0.0001] Steps: 25%|██▍ | 174/700 [01:15<03:47, 2.31it/s, loss=0.174, lr=0.0001] Steps: 25%|██▌ | 175/700 [01:16<03:47, 2.31it/s, loss=0.174, lr=0.0001] Steps: 25%|██▌ | 175/700 [01:16<03:47, 2.31it/s, loss=0.117, lr=0.0001] Steps: 25%|██▌ | 176/700 [01:16<03:46, 2.31it/s, loss=0.117, lr=0.0001] Steps: 25%|██▌ | 176/700 [01:16<03:46, 2.31it/s, loss=0.169, lr=0.0001] Steps: 25%|██▌ | 177/700 [01:16<03:47, 2.30it/s, loss=0.169, lr=0.0001] Steps: 25%|██▌ | 177/700 [01:17<03:47, 2.30it/s, loss=0.0948, lr=0.0001] Steps: 25%|██▌ | 178/700 [01:17<03:46, 2.30it/s, loss=0.0948, lr=0.0001] Steps: 25%|██▌ | 178/700 [01:17<03:46, 2.30it/s, loss=0.275, lr=0.0001] Steps: 26%|██▌ | 179/700 [01:17<03:46, 2.30it/s, loss=0.275, lr=0.0001] Steps: 26%|██▌ | 179/700 [01:17<03:46, 2.30it/s, loss=0.109, lr=0.0001] Steps: 26%|██▌ | 180/700 [01:18<03:45, 2.31it/s, loss=0.109, lr=0.0001] Steps: 26%|██▌ | 180/700 [01:18<03:45, 2.31it/s, loss=0.0641, lr=0.0001] Steps: 26%|██▌ | 181/700 [01:18<03:45, 2.30it/s, loss=0.0641, lr=0.0001] Steps: 26%|██▌ | 181/700 [01:18<03:45, 2.30it/s, loss=0.245, lr=0.0001] Steps: 26%|██▌ | 182/700 [01:19<03:44, 
2.31it/s, loss=0.245, lr=0.0001] Steps: 26%|██▌ | 182/700 [01:19<03:44, 2.31it/s, loss=0.133, lr=0.0001] Steps: 26%|██▌ | 183/700 [01:19<03:44, 2.31it/s, loss=0.133, lr=0.0001] Steps: 26%|██▌ | 183/700 [01:19<03:44, 2.31it/s, loss=0.0986, lr=0.0001] Steps: 26%|██▋ | 184/700 [01:20<03:43, 2.30it/s, loss=0.0986, lr=0.0001] Steps: 26%|██▋ | 184/700 [01:20<03:43, 2.30it/s, loss=0.152, lr=0.0001] Steps: 26%|██▋ | 185/700 [01:20<03:43, 2.31it/s, loss=0.152, lr=0.0001] Steps: 26%|██▋ | 185/700 [01:20<03:43, 2.31it/s, loss=0.136, lr=0.0001] Steps: 27%|██▋ | 186/700 [01:20<03:42, 2.31it/s, loss=0.136, lr=0.0001] Steps: 27%|██▋ | 186/700 [01:20<03:42, 2.31it/s, loss=0.172, lr=0.0001] Steps: 27%|██▋ | 187/700 [01:21<03:42, 2.31it/s, loss=0.172, lr=0.0001] Steps: 27%|██▋ | 187/700 [01:21<03:42, 2.31it/s, loss=0.31, lr=0.0001] Steps: 27%|██▋ | 188/700 [01:21<03:42, 2.30it/s, loss=0.31, lr=0.0001] Steps: 27%|██▋ | 188/700 [01:21<03:42, 2.30it/s, loss=0.124, lr=0.0001] Steps: 27%|██▋ | 189/700 [01:22<03:41, 2.30it/s, loss=0.124, lr=0.0001] Steps: 27%|██▋ | 189/700 [01:22<03:41, 2.30it/s, loss=0.049, lr=0.0001] Steps: 27%|██▋ | 190/700 [01:22<03:41, 2.30it/s, loss=0.049, lr=0.0001] Steps: 27%|██▋ | 190/700 [01:22<03:41, 2.30it/s, loss=0.0852, lr=0.0001] Steps: 27%|██▋ | 191/700 [01:23<03:41, 2.30it/s, loss=0.0852, lr=0.0001] Steps: 27%|██▋ | 191/700 [01:23<03:41, 2.30it/s, loss=0.0649, lr=0.0001] Steps: 27%|██▋ | 192/700 [01:23<03:40, 2.31it/s, loss=0.0649, lr=0.0001] Steps: 27%|██▋ | 192/700 [01:23<03:40, 2.31it/s, loss=0.0476, lr=0.0001] Steps: 28%|██▊ | 193/700 [01:23<03:41, 2.29it/s, loss=0.0476, lr=0.0001] Steps: 28%|██▊ | 193/700 [01:23<03:41, 2.29it/s, loss=0.0807, lr=0.0001] Steps: 28%|██▊ | 194/700 [01:24<03:40, 2.29it/s, loss=0.0807, lr=0.0001] Steps: 28%|██▊ | 194/700 [01:24<03:40, 2.29it/s, loss=0.207, lr=0.0001] Steps: 28%|██▊ | 195/700 [01:24<03:39, 2.30it/s, loss=0.207, lr=0.0001] Steps: 28%|██▊ | 195/700 [01:24<03:39, 2.30it/s, loss=0.153, lr=0.0001] Steps: 28%|██▊ 
| 196/700 [01:25<03:38, 2.30it/s, loss=0.153, lr=0.0001] Steps: 28%|██▊ | 196/700 [01:25<03:38, 2.30it/s, loss=0.0468, lr=0.0001] Steps: 28%|██▊ | 197/700 [01:25<03:38, 2.31it/s, loss=0.0468, lr=0.0001] Steps: 28%|██▊ | 197/700 [01:25<03:38, 2.31it/s, loss=0.194, lr=0.0001] Steps: 28%|██▊ | 198/700 [01:26<03:37, 2.31it/s, loss=0.194, lr=0.0001] Steps: 28%|██▊ | 198/700 [01:26<03:37, 2.31it/s, loss=0.341, lr=0.0001] Steps: 28%|██▊ | 199/700 [01:26<03:37, 2.31it/s, loss=0.341, lr=0.0001] Steps: 28%|██▊ | 199/700 [01:26<03:37, 2.31it/s, loss=0.0981, lr=0.0001] Steps: 29%|██▊ | 200/700 [01:26<03:36, 2.31it/s, loss=0.0981, lr=0.0001] Steps: 29%|██▊ | 200/700 [01:27<03:36, 2.31it/s, loss=0.193, lr=0.0001] Steps: 29%|██▊ | 201/700 [01:27<03:36, 2.30it/s, loss=0.193, lr=0.0001] Steps: 29%|██▊ | 201/700 [01:27<03:36, 2.30it/s, loss=0.0917, lr=0.0001] Steps: 29%|██▉ | 202/700 [01:27<03:35, 2.31it/s, loss=0.0917, lr=0.0001] Steps: 29%|██▉ | 202/700 [01:27<03:35, 2.31it/s, loss=0.149, lr=0.0001] Steps: 29%|██▉ | 203/700 [01:28<03:35, 2.31it/s, loss=0.149, lr=0.0001] Steps: 29%|██▉ | 203/700 [01:28<03:35, 2.31it/s, loss=0.0842, lr=0.0001] Steps: 29%|██▉ | 204/700 [01:28<03:34, 2.31it/s, loss=0.0842, lr=0.0001] Steps: 29%|██▉ | 204/700 [01:28<03:34, 2.31it/s, loss=0.27, lr=0.0001] Steps: 29%|██▉ | 205/700 [01:29<03:34, 2.31it/s, loss=0.27, lr=0.0001] Steps: 29%|██▉ | 205/700 [01:29<03:34, 2.31it/s, loss=0.234, lr=0.0001] Steps: 29%|██▉ | 206/700 [01:29<03:34, 2.31it/s, loss=0.234, lr=0.0001] Steps: 29%|██▉ | 206/700 [01:29<03:34, 2.31it/s, loss=0.125, lr=0.0001] Steps: 30%|██▉ | 207/700 [01:30<03:33, 2.31it/s, loss=0.125, lr=0.0001] Steps: 30%|██▉ | 207/700 [01:30<03:33, 2.31it/s, loss=0.0958, lr=0.0001] Steps: 30%|██▉ | 208/700 [01:30<03:33, 2.31it/s, loss=0.0958, lr=0.0001] Steps: 30%|██▉ | 208/700 [01:30<03:33, 2.31it/s, loss=0.0906, lr=0.0001] Steps: 30%|██▉ | 209/700 [01:30<03:33, 2.30it/s, loss=0.0906, lr=0.0001] Steps: 30%|██▉ | 209/700 [01:30<03:33, 2.30it/s, 
loss=0.0941, lr=0.0001] Steps: 30%|███ | 210/700 [01:31<03:32, 2.30it/s, loss=0.0941, lr=0.0001] Steps: 30%|███ | 210/700 [01:31<03:32, 2.30it/s, loss=0.0909, lr=0.0001] Steps: 30%|███ | 211/700 [01:31<03:32, 2.30it/s, loss=0.0909, lr=0.0001] Steps: 30%|███ | 211/700 [01:31<03:32, 2.30it/s, loss=0.126, lr=0.0001] Steps: 30%|███ | 212/700 [01:32<03:31, 2.30it/s, loss=0.126, lr=0.0001] Steps: 30%|███ | 212/700 [01:32<03:31, 2.30it/s, loss=0.148, lr=0.0001] Steps: 30%|███ | 213/700 [01:32<03:31, 2.31it/s, loss=0.148, lr=0.0001] Steps: 30%|███ | 213/700 [01:32<03:31, 2.31it/s, loss=0.259, lr=0.0001] Steps: 31%|███ | 214/700 [01:33<03:30, 2.31it/s, loss=0.259, lr=0.0001] Steps: 31%|███ | 214/700 [01:33<03:30, 2.31it/s, loss=0.233, lr=0.0001] Steps: 31%|███ | 215/700 [01:33<03:30, 2.31it/s, loss=0.233, lr=0.0001] Steps: 31%|███ | 215/700 [01:33<03:30, 2.31it/s, loss=0.0979, lr=0.0001] Steps: 31%|███ | 216/700 [01:33<03:29, 2.31it/s, loss=0.0979, lr=0.0001] Steps: 31%|███ | 216/700 [01:33<03:29, 2.31it/s, loss=0.167, lr=0.0001] Steps: 31%|███ | 217/700 [01:34<03:29, 2.31it/s, loss=0.167, lr=0.0001] Steps: 31%|███ | 217/700 [01:34<03:29, 2.31it/s, loss=0.136, lr=0.0001] Steps: 31%|███ | 218/700 [01:34<03:28, 2.31it/s, loss=0.136, lr=0.0001] Steps: 31%|███ | 218/700 [01:34<03:28, 2.31it/s, loss=0.112, lr=0.0001] Steps: 31%|███▏ | 219/700 [01:35<03:28, 2.31it/s, loss=0.112, lr=0.0001] Steps: 31%|███▏ | 219/700 [01:35<03:28, 2.31it/s, loss=0.0973, lr=0.0001] Steps: 31%|███▏ | 220/700 [01:35<03:27, 2.31it/s, loss=0.0973, lr=0.0001] Steps: 31%|███▏ | 220/700 [01:35<03:27, 2.31it/s, loss=0.113, lr=0.0001] Steps: 32%|███▏ | 221/700 [01:36<03:27, 2.31it/s, loss=0.113, lr=0.0001] Steps: 32%|███▏ | 221/700 [01:36<03:27, 2.31it/s, loss=0.094, lr=0.0001] Steps: 32%|███▏ | 222/700 [01:36<03:26, 2.31it/s, loss=0.094, lr=0.0001] Steps: 32%|███▏ | 222/700 [01:36<03:26, 2.31it/s, loss=0.141, lr=0.0001] Steps: 32%|███▏ | 223/700 [01:36<03:26, 2.31it/s, loss=0.141, lr=0.0001] Steps: 32%|███▏ 
| 223/700 [01:36<03:26, 2.31it/s, loss=0.148, lr=0.0001] Steps: 32%|███▏ | 224/700 [01:37<03:25, 2.31it/s, loss=0.148, lr=0.0001] Steps: 32%|███▏ | 224/700 [01:37<03:25, 2.31it/s, loss=0.105, lr=0.0001] Steps: 32%|███▏ | 225/700 [01:37<03:26, 2.30it/s, loss=0.105, lr=0.0001] Steps: 32%|███▏ | 225/700 [01:37<03:26, 2.30it/s, loss=0.255, lr=0.0001] Steps: 32%|███▏ | 226/700 [01:38<03:25, 2.30it/s, loss=0.255, lr=0.0001] Steps: 32%|███▏ | 226/700 [01:38<03:25, 2.30it/s, loss=0.189, lr=0.0001] Steps: 32%|███▏ | 227/700 [01:38<03:25, 2.30it/s, loss=0.189, lr=0.0001] Steps: 32%|███▏ | 227/700 [01:38<03:25, 2.30it/s, loss=0.117, lr=0.0001] Steps: 33%|███▎ | 228/700 [01:39<03:24, 2.31it/s, loss=0.117, lr=0.0001] Steps: 33%|███▎ | 228/700 [01:39<03:24, 2.31it/s, loss=0.0894, lr=0.0001] Steps: 33%|███▎ | 229/700 [01:39<03:24, 2.31it/s, loss=0.0894, lr=0.0001] Steps: 33%|███▎ | 229/700 [01:39<03:24, 2.31it/s, loss=0.107, lr=0.0001] Steps: 33%|███▎ | 230/700 [01:39<03:23, 2.31it/s, loss=0.107, lr=0.0001] Steps: 33%|███▎ | 230/700 [01:40<03:23, 2.31it/s, loss=0.0873, lr=0.0001] Steps: 33%|███▎ | 231/700 [01:40<03:23, 2.31it/s, loss=0.0873, lr=0.0001] Steps: 33%|███▎ | 231/700 [01:40<03:23, 2.31it/s, loss=0.0671, lr=0.0001] Steps: 33%|███▎ | 232/700 [01:40<03:22, 2.31it/s, loss=0.0671, lr=0.0001] Steps: 33%|███▎ | 232/700 [01:40<03:22, 2.31it/s, loss=0.094, lr=0.0001] Steps: 33%|███▎ | 233/700 [01:41<03:22, 2.31it/s, loss=0.094, lr=0.0001] Steps: 33%|███▎ | 233/700 [01:41<03:22, 2.31it/s, loss=0.124, lr=0.0001] Steps: 33%|███▎ | 234/700 [01:41<03:21, 2.31it/s, loss=0.124, lr=0.0001] Steps: 33%|███▎ | 234/700 [01:41<03:21, 2.31it/s, loss=0.0847, lr=0.0001] Steps: 34%|███▎ | 235/700 [01:42<03:21, 2.31it/s, loss=0.0847, lr=0.0001] Steps: 34%|███▎ | 235/700 [01:42<03:21, 2.31it/s, loss=0.236, lr=0.0001] Steps: 34%|███▎ | 236/700 [01:42<03:20, 2.31it/s, loss=0.236, lr=0.0001] Steps: 34%|███▎ | 236/700 [01:42<03:20, 2.31it/s, loss=0.0215, lr=0.0001] Steps: 34%|███▍ | 237/700 
[01:43<03:20, 2.31it/s, loss=0.0215, lr=0.0001] Steps: 34%|███▍ | 237/700 [01:43<03:20, 2.31it/s, loss=0.0918, lr=0.0001] Steps: 34%|███▍ | 238/700 [01:43<03:19, 2.31it/s, loss=0.0918, lr=0.0001] Steps: 34%|███▍ | 238/700 [01:43<03:19, 2.31it/s, loss=0.152, lr=0.0001] Steps: 34%|███▍ | 239/700 [01:43<03:19, 2.31it/s, loss=0.152, lr=0.0001] Steps: 34%|███▍ | 239/700 [01:43<03:19, 2.31it/s, loss=0.0908, lr=0.0001] Steps: 34%|███▍ | 240/700 [01:44<03:18, 2.31it/s, loss=0.0908, lr=0.0001] Steps: 34%|███▍ | 240/700 [01:44<03:18, 2.31it/s, loss=0.0664, lr=0.0001] Steps: 34%|███▍ | 241/700 [01:44<03:19, 2.30it/s, loss=0.0664, lr=0.0001] Steps: 34%|███▍ | 241/700 [01:44<03:19, 2.30it/s, loss=0.0761, lr=0.0001] Steps: 35%|███▍ | 242/700 [01:45<03:18, 2.30it/s, loss=0.0761, lr=0.0001] Steps: 35%|███▍ | 242/700 [01:45<03:18, 2.30it/s, loss=0.0773, lr=0.0001] Steps: 35%|███▍ | 243/700 [01:45<03:18, 2.31it/s, loss=0.0773, lr=0.0001] Steps: 35%|███▍ | 243/700 [01:45<03:18, 2.31it/s, loss=0.127, lr=0.0001] Steps: 35%|███▍ | 244/700 [01:46<03:17, 2.31it/s, loss=0.127, lr=0.0001] Steps: 35%|███▍ | 244/700 [01:46<03:17, 2.31it/s, loss=0.16, lr=0.0001] Steps: 35%|███▌ | 245/700 [01:46<03:17, 2.31it/s, loss=0.16, lr=0.0001] Steps: 35%|███▌ | 245/700 [01:46<03:17, 2.31it/s, loss=0.0749, lr=0.0001] Steps: 35%|███▌ | 246/700 [01:46<03:16, 2.31it/s, loss=0.0749, lr=0.0001] Steps: 35%|███▌ | 246/700 [01:46<03:16, 2.31it/s, loss=0.143, lr=0.0001] Steps: 35%|███▌ | 247/700 [01:47<03:16, 2.31it/s, loss=0.143, lr=0.0001] Steps: 35%|███▌ | 247/700 [01:47<03:16, 2.31it/s, loss=0.221, lr=0.0001] Steps: 35%|███▌ | 248/700 [01:47<03:15, 2.31it/s, loss=0.221, lr=0.0001] Steps: 35%|███▌ | 248/700 [01:47<03:15, 2.31it/s, loss=0.0879, lr=0.0001] Steps: 36%|███▌ | 249/700 [01:48<03:15, 2.31it/s, loss=0.0879, lr=0.0001] Steps: 36%|███▌ | 249/700 [01:48<03:15, 2.31it/s, loss=0.0838, lr=0.0001] Steps: 36%|███▌ | 250/700 [01:48<03:14, 2.31it/s, loss=0.0838, lr=0.0001] Steps: 36%|███▌ | 250/700 [01:48<03:14, 
2.31it/s, loss=0.166, lr=0.0001] Steps: 36%|███▌ | 251/700 [01:49<03:14, 2.31it/s, loss=0.166, lr=0.0001] Steps: 36%|███▌ | 251/700 [01:49<03:14, 2.31it/s, loss=0.156, lr=0.0001] Steps: 36%|███▌ | 252/700 [01:49<03:13, 2.31it/s, loss=0.156, lr=0.0001] Steps: 36%|███▌ | 252/700 [01:49<03:13, 2.31it/s, loss=0.256, lr=0.0001] Steps: 36%|███▌ | 253/700 [01:49<03:13, 2.31it/s, loss=0.256, lr=0.0001] Steps: 36%|███▌ | 253/700 [01:49<03:13, 2.31it/s, loss=0.044, lr=0.0001] Steps: 36%|███▋ | 254/700 [01:50<03:12, 2.31it/s, loss=0.044, lr=0.0001] Steps: 36%|███▋ | 254/700 [01:50<03:12, 2.31it/s, loss=0.182, lr=0.0001] Steps: 36%|███▋ | 255/700 [01:50<03:12, 2.31it/s, loss=0.182, lr=0.0001] Steps: 36%|███▋ | 255/700 [01:50<03:12, 2.31it/s, loss=0.102, lr=0.0001] Steps: 37%|███▋ | 256/700 [01:51<03:12, 2.31it/s, loss=0.102, lr=0.0001] Steps: 37%|███▋ | 256/700 [01:51<03:12, 2.31it/s, loss=0.151, lr=0.0001] Steps: 37%|███▋ | 257/700 [01:51<03:12, 2.30it/s, loss=0.151, lr=0.0001] Steps: 37%|███▋ | 257/700 [01:51<03:12, 2.30it/s, loss=0.0976, lr=0.0001] Steps: 37%|███▋ | 258/700 [01:52<03:11, 2.30it/s, loss=0.0976, lr=0.0001] Steps: 37%|███▋ | 258/700 [01:52<03:11, 2.30it/s, loss=0.193, lr=0.0001] Steps: 37%|███▋ | 259/700 [01:52<03:11, 2.31it/s, loss=0.193, lr=0.0001] Steps: 37%|███▋ | 259/700 [01:52<03:11, 2.31it/s, loss=0.0853, lr=0.0001] Steps: 37%|███▋ | 260/700 [01:52<03:10, 2.31it/s, loss=0.0853, lr=0.0001] Steps: 37%|███▋ | 260/700 [01:53<03:10, 2.31it/s, loss=0.201, lr=0.0001] Steps: 37%|███▋ | 261/700 [01:53<03:10, 2.31it/s, loss=0.201, lr=0.0001] Steps: 37%|███▋ | 261/700 [01:53<03:10, 2.31it/s, loss=0.191, lr=0.0001] Steps: 37%|███▋ | 262/700 [01:53<03:09, 2.31it/s, loss=0.191, lr=0.0001] Steps: 37%|███▋ | 262/700 [01:53<03:09, 2.31it/s, loss=0.0494, lr=0.0001] Steps: 38%|███▊ | 263/700 [01:54<03:09, 2.31it/s, loss=0.0494, lr=0.0001] Steps: 38%|███▊ | 263/700 [01:54<03:09, 2.31it/s, loss=0.0995, lr=0.0001] Steps: 38%|███▊ | 264/700 [01:54<03:08, 2.31it/s, 
loss=0.0995, lr=0.0001] Steps: 38%|███▊ | 264/700 [01:54<03:08, 2.31it/s, loss=0.204, lr=0.0001] Steps: 38%|███▊ | 265/700 [01:55<03:08, 2.31it/s, loss=0.204, lr=0.0001] Steps: 38%|███▊ | 265/700 [01:55<03:08, 2.31it/s, loss=0.18, lr=0.0001] Steps: 38%|███▊ | 266/700 [01:55<03:07, 2.31it/s, loss=0.18, lr=0.0001] Steps: 38%|███▊ | 266/700 [01:55<03:07, 2.31it/s, loss=0.107, lr=0.0001] Steps: 38%|███▊ | 267/700 [01:56<03:07, 2.31it/s, loss=0.107, lr=0.0001] Steps: 38%|███▊ | 267/700 [01:56<03:07, 2.31it/s, loss=0.243, lr=0.0001] Steps: 38%|███▊ | 268/700 [01:56<03:06, 2.31it/s, loss=0.243, lr=0.0001] Steps: 38%|███▊ | 268/700 [01:56<03:06, 2.31it/s, loss=0.0764, lr=0.0001] Steps: 38%|███▊ | 269/700 [01:56<03:06, 2.31it/s, loss=0.0764, lr=0.0001] Steps: 38%|███▊ | 269/700 [01:56<03:06, 2.31it/s, loss=0.103, lr=0.0001] Steps: 39%|███▊ | 270/700 [01:57<03:06, 2.31it/s, loss=0.103, lr=0.0001] Steps: 39%|███▊ | 270/700 [01:57<03:06, 2.31it/s, loss=0.114, lr=0.0001] Steps: 39%|███▊ | 271/700 [01:57<03:05, 2.31it/s, loss=0.114, lr=0.0001] Steps: 39%|███▊ | 271/700 [01:57<03:05, 2.31it/s, loss=0.206, lr=0.0001] Steps: 39%|███▉ | 272/700 [01:58<03:05, 2.31it/s, loss=0.206, lr=0.0001] Steps: 39%|███▉ | 272/700 [01:58<03:05, 2.31it/s, loss=0.108, lr=0.0001] Steps: 39%|███▉ | 273/700 [01:58<03:05, 2.30it/s, loss=0.108, lr=0.0001] Steps: 39%|███▉ | 273/700 [01:58<03:05, 2.30it/s, loss=0.14, lr=0.0001] Steps: 39%|███▉ | 274/700 [01:59<03:04, 2.30it/s, loss=0.14, lr=0.0001] Steps: 39%|███▉ | 274/700 [01:59<03:04, 2.30it/s, loss=0.0251, lr=0.0001] Steps: 39%|███▉ | 275/700 [01:59<03:04, 2.31it/s, loss=0.0251, lr=0.0001] Steps: 39%|███▉ | 275/700 [01:59<03:04, 2.31it/s, loss=0.151, lr=0.0001] Steps: 39%|███▉ | 276/700 [01:59<03:03, 2.31it/s, loss=0.151, lr=0.0001] Steps: 39%|███▉ | 276/700 [01:59<03:03, 2.31it/s, loss=0.128, lr=0.0001] Steps: 40%|███▉ | 277/700 [02:00<03:03, 2.31it/s, loss=0.128, lr=0.0001] Steps: 40%|███▉ | 277/700 [02:00<03:03, 2.31it/s, loss=0.097, lr=0.0001] 
Steps: 40%|███▉ | 278/700 [02:00<03:02, 2.31it/s, loss=0.097, lr=0.0001] Steps: 40%|███▉ | 278/700 [02:00<03:02, 2.31it/s, loss=0.293, lr=0.0001] Steps: 40%|███▉ | 279/700 [02:01<03:02, 2.31it/s, loss=0.293, lr=0.0001] Steps: 40%|███▉ | 279/700 [02:01<03:02, 2.31it/s, loss=0.286, lr=0.0001] Steps: 40%|████ | 280/700 [02:01<03:01, 2.31it/s, loss=0.286, lr=0.0001] Steps: 40%|████ | 280/700 [02:01<03:01, 2.31it/s, loss=0.171, lr=0.0001] Steps: 40%|████ | 281/700 [02:02<03:01, 2.31it/s, loss=0.171, lr=0.0001] Steps: 40%|████ | 281/700 [02:02<03:01, 2.31it/s, loss=0.2, lr=0.0001] Steps: 40%|████ | 282/700 [02:02<03:00, 2.31it/s, loss=0.2, lr=0.0001] Steps: 40%|████ | 282/700 [02:02<03:00, 2.31it/s, loss=0.153, lr=0.0001] Steps: 40%|████ | 283/700 [02:02<03:00, 2.31it/s, loss=0.153, lr=0.0001] Steps: 40%|████ | 283/700 [02:02<03:00, 2.31it/s, loss=0.132, lr=0.0001] Steps: 41%|████ | 284/700 [02:03<02:59, 2.31it/s, loss=0.132, lr=0.0001] Steps: 41%|████ | 284/700 [02:03<02:59, 2.31it/s, loss=0.115, lr=0.0001] Steps: 41%|████ | 285/700 [02:03<02:59, 2.31it/s, loss=0.115, lr=0.0001] Steps: 41%|████ | 285/700 [02:03<02:59, 2.31it/s, loss=0.159, lr=0.0001] Steps: 41%|████ | 286/700 [02:04<02:59, 2.31it/s, loss=0.159, lr=0.0001] Steps: 41%|████ | 286/700 [02:04<02:59, 2.31it/s, loss=0.0701, lr=0.0001] Steps: 41%|████ | 287/700 [02:04<02:58, 2.31it/s, loss=0.0701, lr=0.0001] Steps: 41%|████ | 287/700 [02:04<02:58, 2.31it/s, loss=0.134, lr=0.0001] Steps: 41%|████ | 288/700 [02:05<02:58, 2.31it/s, loss=0.134, lr=0.0001] Steps: 41%|████ | 288/700 [02:05<02:58, 2.31it/s, loss=0.188, lr=0.0001] Steps: 41%|████▏ | 289/700 [02:05<02:58, 2.30it/s, loss=0.188, lr=0.0001] Steps: 41%|████▏ | 289/700 [02:05<02:58, 2.30it/s, loss=0.0311, lr=0.0001] Steps: 41%|████▏ | 290/700 [02:05<02:58, 2.30it/s, loss=0.0311, lr=0.0001] Steps: 41%|████▏ | 290/700 [02:05<02:58, 2.30it/s, loss=0.13, lr=0.0001] Steps: 42%|████▏ | 291/700 [02:06<02:57, 2.30it/s, loss=0.13, lr=0.0001] Steps: 42%|████▏ | 
291/700 [02:06<02:57, 2.30it/s, loss=0.286, lr=0.0001] Steps: 42%|████▏ | 292/700 [02:06<02:57, 2.30it/s, loss=0.286, lr=0.0001] Steps: 42%|████▏ | 292/700 [02:06<02:57, 2.30it/s, loss=0.136, lr=0.0001] Steps: 42%|████▏ | 293/700 [02:07<02:56, 2.31it/s, loss=0.136, lr=0.0001] Steps: 42%|████▏ | 293/700 [02:07<02:56, 2.31it/s, loss=0.0702, lr=0.0001] Steps: 42%|████▏ | 294/700 [02:07<02:55, 2.31it/s, loss=0.0702, lr=0.0001] Steps: 42%|████▏ | 294/700 [02:07<02:55, 2.31it/s, loss=0.161, lr=0.0001] Steps: 42%|████▏ | 295/700 [02:08<02:55, 2.31it/s, loss=0.161, lr=0.0001] Steps: 42%|████▏ | 295/700 [02:08<02:55, 2.31it/s, loss=0.0911, lr=0.0001] Steps: 42%|████▏ | 296/700 [02:08<02:54, 2.31it/s, loss=0.0911, lr=0.0001] Steps: 42%|████▏ | 296/700 [02:08<02:54, 2.31it/s, loss=0.074, lr=0.0001] Steps: 42%|████▏ | 297/700 [02:08<02:54, 2.31it/s, loss=0.074, lr=0.0001] Steps: 42%|████▏ | 297/700 [02:09<02:54, 2.31it/s, loss=0.112, lr=0.0001] Steps: 43%|████▎ | 298/700 [02:09<02:54, 2.31it/s, loss=0.112, lr=0.0001] Steps: 43%|████▎ | 298/700 [02:09<02:54, 2.31it/s, loss=0.0824, lr=0.0001] Steps: 43%|████▎ | 299/700 [02:09<02:53, 2.31it/s, loss=0.0824, lr=0.0001] Steps: 43%|████▎ | 299/700 [02:09<02:53, 2.31it/s, loss=0.124, lr=0.0001] Steps: 43%|████▎ | 300/700 [02:10<02:53, 2.31it/s, loss=0.124, lr=0.0001] Steps: 43%|████▎ | 300/700 [02:10<02:53, 2.31it/s, loss=0.129, lr=0.0001] Steps: 43%|████▎ | 301/700 [02:10<02:53, 2.31it/s, loss=0.129, lr=0.0001] Steps: 43%|████▎ | 301/700 [02:10<02:53, 2.31it/s, loss=0.148, lr=0.0001] Steps: 43%|████▎ | 302/700 [02:11<02:52, 2.31it/s, loss=0.148, lr=0.0001] Steps: 43%|████▎ | 302/700 [02:11<02:52, 2.31it/s, loss=0.0999, lr=0.0001] Steps: 43%|████▎ | 303/700 [02:11<02:51, 2.31it/s, loss=0.0999, lr=0.0001] Steps: 43%|████▎ | 303/700 [02:11<02:51, 2.31it/s, loss=0.0991, lr=0.0001] Steps: 43%|████▎ | 304/700 [02:12<02:51, 2.31it/s, loss=0.0991, lr=0.0001] Steps: 43%|████▎ | 304/700 [02:12<02:51, 2.31it/s, loss=0.206, lr=0.0001] Steps: 
44%|████▎ | 305/700 [02:12<02:51, 2.30it/s, loss=0.206, lr=0.0001] Steps: 44%|████▎ | 305/700 [02:12<02:51, 2.30it/s, loss=0.0953, lr=0.0001] Steps: 44%|████▎ | 306/700 [02:12<02:51, 2.30it/s, loss=0.0953, lr=0.0001] Steps: 44%|████▎ | 306/700 [02:12<02:51, 2.30it/s, loss=0.132, lr=0.0001] Steps: 44%|████▍ | 307/700 [02:13<02:50, 2.31it/s, loss=0.132, lr=0.0001] Steps: 44%|████▍ | 307/700 [02:13<02:50, 2.31it/s, loss=0.0862, lr=0.0001] Steps: 44%|████▍ | 308/700 [02:13<02:49, 2.31it/s, loss=0.0862, lr=0.0001] Steps: 44%|████▍ | 308/700 [02:13<02:49, 2.31it/s, loss=0.0361, lr=0.0001] Steps: 44%|████▍ | 309/700 [02:14<02:49, 2.31it/s, loss=0.0361, lr=0.0001] Steps: 44%|████▍ | 309/700 [02:14<02:49, 2.31it/s, loss=0.229, lr=0.0001] Steps: 44%|████▍ | 310/700 [02:14<02:49, 2.31it/s, loss=0.229, lr=0.0001] Steps: 44%|████▍ | 310/700 [02:14<02:49, 2.31it/s, loss=0.133, lr=0.0001] Steps: 44%|████▍ | 311/700 [02:15<02:48, 2.31it/s, loss=0.133, lr=0.0001] Steps: 44%|████▍ | 311/700 [02:15<02:48, 2.31it/s, loss=0.163, lr=0.0001] Steps: 45%|████▍ | 312/700 [02:15<02:47, 2.31it/s, loss=0.163, lr=0.0001] Steps: 45%|████▍ | 312/700 [02:15<02:47, 2.31it/s, loss=0.116, lr=0.0001] Steps: 45%|████▍ | 313/700 [02:15<02:47, 2.31it/s, loss=0.116, lr=0.0001] Steps: 45%|████▍ | 313/700 [02:15<02:47, 2.31it/s, loss=0.309, lr=0.0001] Steps: 45%|████▍ | 314/700 [02:16<02:47, 2.31it/s, loss=0.309, lr=0.0001] Steps: 45%|████▍ | 314/700 [02:16<02:47, 2.31it/s, loss=0.0657, lr=0.0001] Steps: 45%|████▌ | 315/700 [02:16<02:46, 2.31it/s, loss=0.0657, lr=0.0001] Steps: 45%|████▌ | 315/700 [02:16<02:46, 2.31it/s, loss=0.0988, lr=0.0001] Steps: 45%|████▌ | 316/700 [02:17<02:46, 2.31it/s, loss=0.0988, lr=0.0001] Steps: 45%|████▌ | 316/700 [02:17<02:46, 2.31it/s, loss=0.103, lr=0.0001] Steps: 45%|████▌ | 317/700 [02:17<02:45, 2.31it/s, loss=0.103, lr=0.0001] Steps: 45%|████▌ | 317/700 [02:17<02:45, 2.31it/s, loss=0.282, lr=0.0001] Steps: 45%|████▌ | 318/700 [02:18<02:45, 2.31it/s, loss=0.282, 
lr=0.0001] Steps: 45%|████▌ | 318/700 [02:18<02:45, 2.31it/s, loss=0.162, lr=0.0001] Steps: 46%|████▌ | 319/700 [02:18<02:45, 2.31it/s, loss=0.162, lr=0.0001] Steps: 46%|████▌ | 319/700 [02:18<02:45, 2.31it/s, loss=0.11, lr=0.0001] Steps: 46%|████▌ | 320/700 [02:18<02:44, 2.31it/s, loss=0.11, lr=0.0001] Steps: 46%|████▌ | 320/700 [02:18<02:44, 2.31it/s, loss=0.165, lr=0.0001] Steps: 46%|████▌ | 321/700 [02:19<02:44, 2.30it/s, loss=0.165, lr=0.0001] Steps: 46%|████▌ | 321/700 [02:19<02:44, 2.30it/s, loss=0.105, lr=0.0001] Steps: 46%|████▌ | 322/700 [02:19<02:44, 2.30it/s, loss=0.105, lr=0.0001] Steps: 46%|████▌ | 322/700 [02:19<02:44, 2.30it/s, loss=0.246, lr=0.0001] Steps: 46%|████▌ | 323/700 [02:20<02:43, 2.30it/s, loss=0.246, lr=0.0001] Steps: 46%|████▌ | 323/700 [02:20<02:43, 2.30it/s, loss=0.0769, lr=0.0001] Steps: 46%|████▋ | 324/700 [02:20<02:43, 2.31it/s, loss=0.0769, lr=0.0001] Steps: 46%|████▋ | 324/700 [02:20<02:43, 2.31it/s, loss=0.101, lr=0.0001] Steps: 46%|████▋ | 325/700 [02:21<02:42, 2.31it/s, loss=0.101, lr=0.0001] Steps: 46%|████▋ | 325/700 [02:21<02:42, 2.31it/s, loss=0.161, lr=0.0001] Steps: 47%|████▋ | 326/700 [02:21<02:42, 2.31it/s, loss=0.161, lr=0.0001] Steps: 47%|████▋ | 326/700 [02:21<02:42, 2.31it/s, loss=0.175, lr=0.0001] Steps: 47%|████▋ | 327/700 [02:22<02:41, 2.31it/s, loss=0.175, lr=0.0001] Steps: 47%|████▋ | 327/700 [02:22<02:41, 2.31it/s, loss=0.147, lr=0.0001] Steps: 47%|████▋ | 328/700 [02:22<02:40, 2.31it/s, loss=0.147, lr=0.0001] Steps: 47%|████▋ | 328/700 [02:22<02:40, 2.31it/s, loss=0.258, lr=0.0001] Steps: 47%|████▋ | 329/700 [02:22<02:40, 2.31it/s, loss=0.258, lr=0.0001] Steps: 47%|████▋ | 329/700 [02:22<02:40, 2.31it/s, loss=0.117, lr=0.0001] Steps: 47%|████▋ | 330/700 [02:23<02:40, 2.31it/s, loss=0.117, lr=0.0001] Steps: 47%|████▋ | 330/700 [02:23<02:40, 2.31it/s, loss=0.0967, lr=0.0001] Steps: 47%|████▋ | 331/700 [02:23<02:39, 2.31it/s, loss=0.0967, lr=0.0001] Steps: 47%|████▋ | 331/700 [02:23<02:39, 2.31it/s, 
loss=0.0688, lr=0.0001] Steps: 47%|████▋ | 332/700 [02:24<02:39, 2.31it/s, loss=0.0688, lr=0.0001] Steps: 47%|████▋ | 332/700 [02:24<02:39, 2.31it/s, loss=0.102, lr=0.0001] Steps: 48%|████▊ | 333/700 [02:24<02:38, 2.31it/s, loss=0.102, lr=0.0001] Steps: 48%|████▊ | 333/700 [02:24<02:38, 2.31it/s, loss=0.0854, lr=0.0001] Steps: 48%|████▊ | 334/700 [02:25<02:38, 2.31it/s, loss=0.0854, lr=0.0001] Steps: 48%|████▊ | 334/700 [02:25<02:38, 2.31it/s, loss=0.0907, lr=0.0001] Steps: 48%|████▊ | 335/700 [02:25<02:37, 2.31it/s, loss=0.0907, lr=0.0001] Steps: 48%|████▊ | 335/700 [02:25<02:37, 2.31it/s, loss=0.243, lr=0.0001] Steps: 48%|████▊ | 336/700 [02:25<02:37, 2.31it/s, loss=0.243, lr=0.0001] Steps: 48%|████▊ | 336/700 [02:25<02:37, 2.31it/s, loss=0.182, lr=0.0001] Steps: 48%|████▊ | 337/700 [02:26<02:37, 2.30it/s, loss=0.182, lr=0.0001] Steps: 48%|████▊ | 337/700 [02:26<02:37, 2.30it/s, loss=0.165, lr=0.0001] Steps: 48%|████▊ | 338/700 [02:26<02:37, 2.30it/s, loss=0.165, lr=0.0001] Steps: 48%|████▊ | 338/700 [02:26<02:37, 2.30it/s, loss=0.116, lr=0.0001] Steps: 48%|████▊ | 339/700 [02:27<02:36, 2.31it/s, loss=0.116, lr=0.0001] Steps: 48%|████▊ | 339/700 [02:27<02:36, 2.31it/s, loss=0.0656, lr=0.0001] Steps: 49%|████▊ | 340/700 [02:27<02:36, 2.31it/s, loss=0.0656, lr=0.0001] Steps: 49%|████▊ | 340/700 [02:27<02:36, 2.31it/s, loss=0.0485, lr=0.0001] Steps: 49%|████▊ | 341/700 [02:28<02:35, 2.31it/s, loss=0.0485, lr=0.0001] Steps: 49%|████▊ | 341/700 [02:28<02:35, 2.31it/s, loss=0.0723, lr=0.0001] Steps: 49%|████▉ | 342/700 [02:28<02:34, 2.31it/s, loss=0.0723, lr=0.0001] Steps: 49%|████▉ | 342/700 [02:28<02:34, 2.31it/s, loss=0.057, lr=0.0001] Steps: 49%|████▉ | 343/700 [02:28<02:34, 2.31it/s, loss=0.057, lr=0.0001] Steps: 49%|████▉ | 343/700 [02:28<02:34, 2.31it/s, loss=0.159, lr=0.0001] Steps: 49%|████▉ | 344/700 [02:29<02:34, 2.31it/s, loss=0.159, lr=0.0001] Steps: 49%|████▉ | 344/700 [02:29<02:34, 2.31it/s, loss=0.193, lr=0.0001] Steps: 49%|████▉ | 345/700 [02:29<02:33, 
2.31it/s, loss=0.193, lr=0.0001] Steps: 49%|████▉ | 345/700 [02:29<02:33, 2.31it/s, loss=0.236, lr=0.0001] Steps: 49%|████▉ | 346/700 [02:30<02:33, 2.31it/s, loss=0.236, lr=0.0001] Steps: 49%|████▉ | 346/700 [02:30<02:33, 2.31it/s, loss=0.108, lr=0.0001] Steps: 50%|████▉ | 347/700 [02:30<02:33, 2.31it/s, loss=0.108, lr=0.0001] Steps: 50%|████▉ | 347/700 [02:30<02:33, 2.31it/s, loss=0.0848, lr=0.0001] Steps: 50%|████▉ | 348/700 [02:31<02:32, 2.31it/s, loss=0.0848, lr=0.0001] Steps: 50%|████▉ | 348/700 [02:31<02:32, 2.31it/s, loss=0.135, lr=0.0001] Steps: 50%|████▉ | 349/700 [02:31<02:32, 2.31it/s, loss=0.135, lr=0.0001] Steps: 50%|████▉ | 349/700 [02:31<02:32, 2.31it/s, loss=0.141, lr=0.0001] Steps: 50%|█████ | 350/700 [02:31<02:31, 2.31it/s, loss=0.141, lr=0.0001] Steps: 50%|█████ | 350/700 [02:31<02:31, 2.31it/s, loss=0.0529, lr=0.0001] Steps: 50%|█████ | 351/700 [02:32<02:31, 2.31it/s, loss=0.0529, lr=0.0001] Steps: 50%|█████ | 351/700 [02:32<02:31, 2.31it/s, loss=0.0894, lr=0.0001] Steps: 50%|█████ | 352/700 [02:32<02:30, 2.31it/s, loss=0.0894, lr=0.0001] Steps: 50%|█████ | 352/700 [02:32<02:30, 2.31it/s, loss=0.343, lr=0.0001] Steps: 50%|█████ | 353/700 [02:33<02:30, 2.30it/s, loss=0.343, lr=0.0001] Steps: 50%|█████ | 353/700 [02:33<02:30, 2.30it/s, loss=0.195, lr=0.0001] Steps: 51%|█████ | 354/700 [02:33<02:30, 2.30it/s, loss=0.195, lr=0.0001] Steps: 51%|█████ | 354/700 [02:33<02:30, 2.30it/s, loss=0.107, lr=0.0001] Steps: 51%|█████ | 355/700 [02:34<02:29, 2.30it/s, loss=0.107, lr=0.0001] Steps: 51%|█████ | 355/700 [02:34<02:29, 2.30it/s, loss=0.0284, lr=0.0001] Steps: 51%|█████ | 356/700 [02:34<02:29, 2.31it/s, loss=0.0284, lr=0.0001] Steps: 51%|█████ | 356/700 [02:34<02:29, 2.31it/s, loss=0.167, lr=0.0001] Steps: 51%|█████ | 357/700 [02:35<02:28, 2.31it/s, loss=0.167, lr=0.0001] Steps: 51%|█████ | 357/700 [02:35<02:28, 2.31it/s, loss=0.14, lr=0.0001] Steps: 51%|█████ | 358/700 [02:35<02:28, 2.31it/s, loss=0.14, lr=0.0001] Steps: 51%|█████ | 358/700 
[02:35<02:28, 2.31it/s, loss=0.111, lr=0.0001] Steps: 51%|█████▏ | 359/700 [02:35<02:27, 2.31it/s, loss=0.111, lr=0.0001] Steps: 51%|█████▏ | 359/700 [02:35<02:27, 2.31it/s, loss=0.199, lr=0.0001] Steps: 51%|█████▏ | 360/700 [02:36<02:27, 2.31it/s, loss=0.199, lr=0.0001] Steps: 51%|█████▏ | 360/700 [02:36<02:27, 2.31it/s, loss=0.2, lr=0.0001] Steps: 52%|█████▏ | 361/700 [02:36<02:26, 2.31it/s, loss=0.2, lr=0.0001] Steps: 52%|█████▏ | 361/700 [02:36<02:26, 2.31it/s, loss=0.0617, lr=0.0001] Steps: 52%|█████▏ | 362/700 [02:37<02:26, 2.31it/s, loss=0.0617, lr=0.0001] Steps: 52%|█████▏ | 362/700 [02:37<02:26, 2.31it/s, loss=0.202, lr=0.0001] Steps: 52%|█████▏ | 363/700 [02:37<02:25, 2.31it/s, loss=0.202, lr=0.0001] Steps: 52%|█████▏ | 363/700 [02:37<02:25, 2.31it/s, loss=0.081, lr=0.0001] Steps: 52%|█████▏ | 364/700 [02:38<02:25, 2.31it/s, loss=0.081, lr=0.0001] Steps: 52%|█████▏ | 364/700 [02:38<02:25, 2.31it/s, loss=0.158, lr=0.0001] Steps: 52%|█████▏ | 365/700 [02:38<02:25, 2.31it/s, loss=0.158, lr=0.0001] Steps: 52%|█████▏ | 365/700 [02:38<02:25, 2.31it/s, loss=0.111, lr=0.0001] Steps: 52%|█████▏ | 366/700 [02:38<02:24, 2.31it/s, loss=0.111, lr=0.0001] Steps: 52%|█████▏ | 366/700 [02:38<02:24, 2.31it/s, loss=0.166, lr=0.0001] Steps: 52%|█████▏ | 367/700 [02:39<02:24, 2.31it/s, loss=0.166, lr=0.0001] Steps: 52%|█████▏ | 367/700 [02:39<02:24, 2.31it/s, loss=0.261, lr=0.0001] Steps: 53%|█████▎ | 368/700 [02:39<02:23, 2.31it/s, loss=0.261, lr=0.0001] Steps: 53%|█████▎ | 368/700 [02:39<02:23, 2.31it/s, loss=0.119, lr=0.0001] Steps: 53%|█████▎ | 369/700 [02:40<02:24, 2.30it/s, loss=0.119, lr=0.0001] Steps: 53%|█████▎ | 369/700 [02:40<02:24, 2.30it/s, loss=0.0896, lr=0.0001] Steps: 53%|█████▎ | 370/700 [02:40<02:23, 2.30it/s, loss=0.0896, lr=0.0001] Steps: 53%|█████▎ | 370/700 [02:40<02:23, 2.30it/s, loss=0.101, lr=0.0001] Steps: 53%|█████▎ | 371/700 [02:41<02:23, 2.30it/s, loss=0.101, lr=0.0001] Steps: 53%|█████▎ | 371/700 [02:41<02:23, 2.30it/s, loss=0.112, lr=0.0001] 
Steps: 53%|█████▎ | 372/700 [02:41<02:22, 2.30it/s, loss=0.112, lr=0.0001] Steps: 53%|█████▎ | 372/700 [02:41<02:22, 2.30it/s, loss=0.132, lr=0.0001] Steps: 53%|█████▎ | 373/700 [02:41<02:21, 2.30it/s, loss=0.132, lr=0.0001] Steps: 53%|█████▎ | 373/700 [02:41<02:21, 2.30it/s, loss=0.15, lr=0.0001] Steps: 53%|█████▎ | 374/700 [02:42<02:21, 2.31it/s, loss=0.15, lr=0.0001] Steps: 53%|█████▎ | 374/700 [02:42<02:21, 2.31it/s, loss=0.326, lr=0.0001] Steps: 54%|█████▎ | 375/700 [02:42<02:20, 2.31it/s, loss=0.326, lr=0.0001] Steps: 54%|█████▎ | 375/700 [02:42<02:20, 2.31it/s, loss=0.117, lr=0.0001] Steps: 54%|█████▎ | 376/700 [02:43<02:20, 2.30it/s, loss=0.117, lr=0.0001] Steps: 54%|█████▎ | 376/700 [02:43<02:20, 2.30it/s, loss=0.128, lr=0.0001] Steps: 54%|█████▍ | 377/700 [02:43<02:20, 2.30it/s, loss=0.128, lr=0.0001] Steps: 54%|█████▍ | 377/700 [02:43<02:20, 2.30it/s, loss=0.146, lr=0.0001] Steps: 54%|█████▍ | 378/700 [02:44<02:19, 2.31it/s, loss=0.146, lr=0.0001] Steps: 54%|█████▍ | 378/700 [02:44<02:19, 2.31it/s, loss=0.219, lr=0.0001] Steps: 54%|█████▍ | 379/700 [02:44<02:19, 2.31it/s, loss=0.219, lr=0.0001] Steps: 54%|█████▍ | 379/700 [02:44<02:19, 2.31it/s, loss=0.0741, lr=0.0001] Steps: 54%|█████▍ | 380/700 [02:44<02:18, 2.31it/s, loss=0.0741, lr=0.0001] Steps: 54%|█████▍ | 380/700 [02:45<02:18, 2.31it/s, loss=0.104, lr=0.0001] Steps: 54%|█████▍ | 381/700 [02:45<02:18, 2.31it/s, loss=0.104, lr=0.0001] Steps: 54%|█████▍ | 381/700 [02:45<02:18, 2.31it/s, loss=0.0772, lr=0.0001] Steps: 55%|█████▍ | 382/700 [02:45<02:18, 2.30it/s, loss=0.0772, lr=0.0001] Steps: 55%|█████▍ | 382/700 [02:45<02:18, 2.30it/s, loss=0.213, lr=0.0001] Steps: 55%|█████▍ | 383/700 [02:46<02:29, 2.11it/s, loss=0.213, lr=0.0001] Steps: 55%|█████▍ | 383/700 [02:46<02:29, 2.11it/s, loss=0.197, lr=0.0001] Steps: 55%|█████▍ | 384/700 [02:46<02:25, 2.16it/s, loss=0.197, lr=0.0001] Steps: 55%|█████▍ | 384/700 [02:46<02:25, 2.16it/s, loss=0.172, lr=0.0001] Steps: 55%|█████▌ | 385/700 [02:47<02:23, 
2.20it/s, loss=0.172, lr=0.0001] Steps: 55%|█████▌ | 385/700 [02:47<02:23, 2.20it/s, loss=0.108, lr=0.0001] Steps: 55%|█████▌ | 386/700 [02:47<02:20, 2.23it/s, loss=0.108, lr=0.0001] Steps: 55%|█████▌ | 386/700 [02:47<02:20, 2.23it/s, loss=0.0851, lr=0.0001] Steps: 55%|█████▌ | 387/700 [02:48<02:18, 2.25it/s, loss=0.0851, lr=0.0001] Steps: 55%|█████▌ | 387/700 [02:48<02:18, 2.25it/s, loss=0.037, lr=0.0001] Steps: 55%|█████▌ | 388/700 [02:48<02:17, 2.27it/s, loss=0.037, lr=0.0001] Steps: 55%|█████▌ | 388/700 [02:48<02:17, 2.27it/s, loss=0.278, lr=0.0001] Steps: 56%|█████▌ | 389/700 [02:49<02:16, 2.28it/s, loss=0.278, lr=0.0001] Steps: 56%|█████▌ | 389/700 [02:49<02:16, 2.28it/s, loss=0.0438, lr=0.0001] Steps: 56%|█████▌ | 390/700 [02:49<02:15, 2.29it/s, loss=0.0438, lr=0.0001] Steps: 56%|█████▌ | 390/700 [02:49<02:15, 2.29it/s, loss=0.171, lr=0.0001] Steps: 56%|█████▌ | 391/700 [02:49<02:14, 2.29it/s, loss=0.171, lr=0.0001] Steps: 56%|█████▌ | 391/700 [02:49<02:14, 2.29it/s, loss=0.0965, lr=0.0001] Steps: 56%|█████▌ | 392/700 [02:50<02:14, 2.30it/s, loss=0.0965, lr=0.0001] Steps: 56%|█████▌ | 392/700 [02:50<02:14, 2.30it/s, loss=0.061, lr=0.0001] Steps: 56%|█████▌ | 393/700 [02:50<02:13, 2.30it/s, loss=0.061, lr=0.0001] Steps: 56%|█████▌ | 393/700 [02:50<02:13, 2.30it/s, loss=0.0909, lr=0.0001] Steps: 56%|█████▋ | 394/700 [02:51<02:12, 2.30it/s, loss=0.0909, lr=0.0001] Steps: 56%|█████▋ | 394/700 [02:51<02:12, 2.30it/s, loss=0.0822, lr=0.0001] Steps: 56%|█████▋ | 395/700 [02:51<02:12, 2.31it/s, loss=0.0822, lr=0.0001] Steps: 56%|█████▋ | 395/700 [02:51<02:12, 2.31it/s, loss=0.0202, lr=0.0001] Steps: 57%|█████▋ | 396/700 [02:52<02:11, 2.31it/s, loss=0.0202, lr=0.0001] Steps: 57%|█████▋ | 396/700 [02:52<02:11, 2.31it/s, loss=0.084, lr=0.0001] Steps: 57%|█████▋ | 397/700 [02:52<02:11, 2.31it/s, loss=0.084, lr=0.0001] Steps: 57%|█████▋ | 397/700 [02:52<02:11, 2.31it/s, loss=0.165, lr=0.0001] Steps: 57%|█████▋ | 398/700 [02:52<02:10, 2.31it/s, loss=0.165, lr=0.0001] 
Steps: 57%|█████▋ | 398/700 [02:52<02:10, 2.31it/s, loss=0.121, lr=0.0001] Steps: 57%|█████▋ | 399/700 [02:53<02:10, 2.31it/s, loss=0.121, lr=0.0001] Steps: 57%|█████▋ | 399/700 [02:53<02:10, 2.31it/s, loss=0.17, lr=0.0001] Steps: 57%|█████▋ | 400/700 [02:53<02:09, 2.31it/s, loss=0.17, lr=0.0001] Steps: 57%|█████▋ | 400/700 [02:53<02:09, 2.31it/s, loss=0.176, lr=0.0001] Steps: 57%|█████▋ | 401/700 [02:54<02:10, 2.30it/s, loss=0.176, lr=0.0001] Steps: 57%|█████▋ | 401/700 [02:54<02:10, 2.30it/s, loss=0.165, lr=0.0001] Steps: 57%|█████▋ | 402/700 [02:54<02:09, 2.30it/s, loss=0.165, lr=0.0001] Steps: 57%|█████▋ | 402/700 [02:54<02:09, 2.30it/s, loss=0.0535, lr=0.0001] Steps: 58%|█████▊ | 403/700 [02:55<02:08, 2.31it/s, loss=0.0535, lr=0.0001] Steps: 58%|█████▊ | 403/700 [02:55<02:08, 2.31it/s, loss=0.15, lr=0.0001] Steps: 58%|█████▊ | 404/700 [02:55<02:08, 2.31it/s, loss=0.15, lr=0.0001] Steps: 58%|█████▊ | 404/700 [02:55<02:08, 2.31it/s, loss=0.122, lr=0.0001] Steps: 58%|█████▊ | 405/700 [02:55<02:07, 2.31it/s, loss=0.122, lr=0.0001] Steps: 58%|█████▊ | 405/700 [02:55<02:07, 2.31it/s, loss=0.111, lr=0.0001] Steps: 58%|█████▊ | 406/700 [02:56<02:07, 2.31it/s, loss=0.111, lr=0.0001] Steps: 58%|█████▊ | 406/700 [02:56<02:07, 2.31it/s, loss=0.148, lr=0.0001] Steps: 58%|█████▊ | 407/700 [02:56<02:06, 2.31it/s, loss=0.148, lr=0.0001] Steps: 58%|█████▊ | 407/700 [02:56<02:06, 2.31it/s, loss=0.135, lr=0.0001] Steps: 58%|█████▊ | 408/700 [02:57<02:06, 2.31it/s, loss=0.135, lr=0.0001] Steps: 58%|█████▊ | 408/700 [02:57<02:06, 2.31it/s, loss=0.0779, lr=0.0001] Steps: 58%|█████▊ | 409/700 [02:57<02:05, 2.31it/s, loss=0.0779, lr=0.0001] Steps: 58%|█████▊ | 409/700 [02:57<02:05, 2.31it/s, loss=0.125, lr=0.0001] Steps: 59%|█████▊ | 410/700 [02:58<02:05, 2.31it/s, loss=0.125, lr=0.0001] Steps: 59%|█████▊ | 410/700 [02:58<02:05, 2.31it/s, loss=0.116, lr=0.0001] Steps: 59%|█████▊ | 411/700 [02:58<02:05, 2.31it/s, loss=0.116, lr=0.0001] Steps: 59%|█████▊ | 411/700 [02:58<02:05, 
2.31it/s, loss=0.187, lr=0.0001] Steps: 59%|█████▉ | 412/700 [02:58<02:04, 2.31it/s, loss=0.187, lr=0.0001] Steps: 59%|█████▉ | 412/700 [02:59<02:04, 2.31it/s, loss=0.0657, lr=0.0001] Steps: 59%|█████▉ | 413/700 [02:59<02:04, 2.31it/s, loss=0.0657, lr=0.0001] Steps: 59%|█████▉ | 413/700 [02:59<02:04, 2.31it/s, loss=0.0886, lr=0.0001] Steps: 59%|█████▉ | 414/700 [02:59<02:03, 2.31it/s, loss=0.0886, lr=0.0001] Steps: 59%|█████▉ | 414/700 [02:59<02:03, 2.31it/s, loss=0.127, lr=0.0001] Steps: 59%|█████▉ | 415/700 [03:00<02:03, 2.31it/s, loss=0.127, lr=0.0001] Steps: 59%|█████▉ | 415/700 [03:00<02:03, 2.31it/s, loss=0.0474, lr=0.0001] Steps: 59%|█████▉ | 416/700 [03:00<02:02, 2.31it/s, loss=0.0474, lr=0.0001] Steps: 59%|█████▉ | 416/700 [03:00<02:02, 2.31it/s, loss=0.135, lr=0.0001] Steps: 60%|█████▉ | 417/700 [03:01<02:03, 2.30it/s, loss=0.135, lr=0.0001] Steps: 60%|█████▉ | 417/700 [03:01<02:03, 2.30it/s, loss=0.127, lr=0.0001] Steps: 60%|█████▉ | 418/700 [03:01<02:02, 2.30it/s, loss=0.127, lr=0.0001] Steps: 60%|█████▉ | 418/700 [03:01<02:02, 2.30it/s, loss=0.136, lr=0.0001] Steps: 60%|█████▉ | 419/700 [03:02<02:01, 2.31it/s, loss=0.136, lr=0.0001] Steps: 60%|█████▉ | 419/700 [03:02<02:01, 2.31it/s, loss=0.197, lr=0.0001] Steps: 60%|██████ | 420/700 [03:02<02:01, 2.31it/s, loss=0.197, lr=0.0001] Steps: 60%|██████ | 420/700 [03:02<02:01, 2.31it/s, loss=0.0675, lr=0.0001] Steps: 60%|██████ | 421/700 [03:02<02:00, 2.31it/s, loss=0.0675, lr=0.0001] Steps: 60%|██████ | 421/700 [03:02<02:00, 2.31it/s, loss=0.0898, lr=0.0001] Steps: 60%|██████ | 422/700 [03:03<02:00, 2.31it/s, loss=0.0898, lr=0.0001] Steps: 60%|██████ | 422/700 [03:03<02:00, 2.31it/s, loss=0.118, lr=0.0001] Steps: 60%|██████ | 423/700 [03:03<01:59, 2.31it/s, loss=0.118, lr=0.0001] Steps: 60%|██████ | 423/700 [03:03<01:59, 2.31it/s, loss=0.14, lr=0.0001] Steps: 61%|██████ | 424/700 [03:04<01:59, 2.31it/s, loss=0.14, lr=0.0001] Steps: 61%|██████ | 424/700 [03:04<01:59, 2.31it/s, loss=0.0937, lr=0.0001] Steps: 
61%|██████ | 425/700 [03:04<01:59, 2.31it/s, loss=0.0937, lr=0.0001] Steps: 61%|██████ | 425/700 [03:04<01:59, 2.31it/s, loss=0.138, lr=0.0001] Steps: 61%|██████ | 426/700 [03:05<01:58, 2.31it/s, loss=0.138, lr=0.0001] Steps: 61%|██████ | 426/700 [03:05<01:58, 2.31it/s, loss=0.158, lr=0.0001] Steps: 61%|██████ | 427/700 [03:05<01:58, 2.31it/s, loss=0.158, lr=0.0001] Steps: 61%|██████ | 427/700 [03:05<01:58, 2.31it/s, loss=0.0508, lr=0.0001] Steps: 61%|██████ | 428/700 [03:05<01:57, 2.31it/s, loss=0.0508, lr=0.0001] Steps: 61%|██████ | 428/700 [03:05<01:57, 2.31it/s, loss=0.0954, lr=0.0001] Steps: 61%|██████▏ | 429/700 [03:06<01:57, 2.31it/s, loss=0.0954, lr=0.0001] Steps: 61%|██████▏ | 429/700 [03:06<01:57, 2.31it/s, loss=0.315, lr=0.0001] Steps: 61%|██████▏ | 430/700 [03:06<01:56, 2.31it/s, loss=0.315, lr=0.0001] Steps: 61%|██████▏ | 430/700 [03:06<01:56, 2.31it/s, loss=0.166, lr=0.0001] Steps: 62%|██████▏ | 431/700 [03:07<01:56, 2.31it/s, loss=0.166, lr=0.0001] Steps: 62%|██████▏ | 431/700 [03:07<01:56, 2.31it/s, loss=0.09, lr=0.0001] Steps: 62%|██████▏ | 432/700 [03:07<01:55, 2.31it/s, loss=0.09, lr=0.0001] Steps: 62%|██████▏ | 432/700 [03:07<01:55, 2.31it/s, loss=0.0611, lr=0.0001] Steps: 62%|██████▏ | 433/700 [03:08<01:56, 2.30it/s, loss=0.0611, lr=0.0001] Steps: 62%|██████▏ | 433/700 [03:08<01:56, 2.30it/s, loss=0.23, lr=0.0001] Steps: 62%|██████▏ | 434/700 [03:08<01:55, 2.30it/s, loss=0.23, lr=0.0001] Steps: 62%|██████▏ | 434/700 [03:08<01:55, 2.30it/s, loss=0.221, lr=0.0001] Steps: 62%|██████▏ | 435/700 [03:08<01:55, 2.30it/s, loss=0.221, lr=0.0001] Steps: 62%|██████▏ | 435/700 [03:08<01:55, 2.30it/s, loss=0.0432, lr=0.0001] Steps: 62%|██████▏ | 436/700 [03:09<01:54, 2.31it/s, loss=0.0432, lr=0.0001] Steps: 62%|██████▏ | 436/700 [03:09<01:54, 2.31it/s, loss=0.127, lr=0.0001] Steps: 62%|██████▏ | 437/700 [03:09<01:53, 2.31it/s, loss=0.127, lr=0.0001] Steps: 62%|██████▏ | 437/700 [03:09<01:53, 2.31it/s, loss=0.121, lr=0.0001] Steps: 63%|██████▎ | 438/700 
[03:10<01:53, 2.31it/s, loss=0.121, lr=0.0001] Steps: 63%|██████▎ | 438/700 [03:10<01:53, 2.31it/s, loss=0.104, lr=0.0001] Steps: 63%|██████▎ | 439/700 [03:10<01:53, 2.31it/s, loss=0.104, lr=0.0001] Steps: 63%|██████▎ | 439/700 [03:10<01:53, 2.31it/s, loss=0.0318, lr=0.0001] Steps: 63%|██████▎ | 440/700 [03:11<01:52, 2.31it/s, loss=0.0318, lr=0.0001] Steps: 63%|██████▎ | 440/700 [03:11<01:52, 2.31it/s, loss=0.109, lr=0.0001] Steps: 63%|██████▎ | 441/700 [03:11<01:52, 2.31it/s, loss=0.109, lr=0.0001] Steps: 63%|██████▎ | 441/700 [03:11<01:52, 2.31it/s, loss=0.0869, lr=0.0001] Steps: 63%|██████▎ | 442/700 [03:11<01:51, 2.31it/s, loss=0.0869, lr=0.0001] Steps: 63%|██████▎ | 442/700 [03:12<01:51, 2.31it/s, loss=0.0479, lr=0.0001] Steps: 63%|██████▎ | 443/700 [03:12<01:51, 2.31it/s, loss=0.0479, lr=0.0001] Steps: 63%|██████▎ | 443/700 [03:12<01:51, 2.31it/s, loss=0.0615, lr=0.0001] Steps: 63%|██████▎ | 444/700 [03:12<01:50, 2.31it/s, loss=0.0615, lr=0.0001] Steps: 63%|██████▎ | 444/700 [03:12<01:50, 2.31it/s, loss=0.0695, lr=0.0001] Steps: 64%|██████▎ | 445/700 [03:13<01:50, 2.31it/s, loss=0.0695, lr=0.0001] Steps: 64%|██████▎ | 445/700 [03:13<01:50, 2.31it/s, loss=0.109, lr=0.0001] Steps: 64%|██████▎ | 446/700 [03:13<01:49, 2.31it/s, loss=0.109, lr=0.0001] Steps: 64%|██████▎ | 446/700 [03:13<01:49, 2.31it/s, loss=0.155, lr=0.0001] Steps: 64%|██████▍ | 447/700 [03:14<01:49, 2.31it/s, loss=0.155, lr=0.0001] Steps: 64%|██████▍ | 447/700 [03:14<01:49, 2.31it/s, loss=0.0106, lr=0.0001] Steps: 64%|██████▍ | 448/700 [03:14<01:49, 2.31it/s, loss=0.0106, lr=0.0001] Steps: 64%|██████▍ | 448/700 [03:14<01:49, 2.31it/s, loss=0.176, lr=0.0001] Steps: 64%|██████▍ | 449/700 [03:15<01:49, 2.30it/s, loss=0.176, lr=0.0001] Steps: 64%|██████▍ | 449/700 [03:15<01:49, 2.30it/s, loss=0.193, lr=0.0001] Steps: 64%|██████▍ | 450/700 [03:15<01:48, 2.30it/s, loss=0.193, lr=0.0001] Steps: 64%|██████▍ | 450/700 [03:15<01:48, 2.30it/s, loss=0.104, lr=0.0001] Steps: 64%|██████▍ | 451/700 
[03:15<01:47, 2.31it/s, loss=0.104, lr=0.0001] Steps: 64%|██████▍ | 451/700 [03:15<01:47, 2.31it/s, loss=0.0734, lr=0.0001] Steps: 65%|██████▍ | 452/700 [03:16<01:47, 2.31it/s, loss=0.0734, lr=0.0001] Steps: 65%|██████▍ | 452/700 [03:16<01:47, 2.31it/s, loss=0.272, lr=0.0001] Steps: 65%|██████▍ | 453/700 [03:16<01:47, 2.31it/s, loss=0.272, lr=0.0001] Steps: 65%|██████▍ | 453/700 [03:16<01:47, 2.31it/s, loss=0.0395, lr=0.0001] Steps: 65%|██████▍ | 454/700 [03:17<01:46, 2.31it/s, loss=0.0395, lr=0.0001] Steps: 65%|██████▍ | 454/700 [03:17<01:46, 2.31it/s, loss=0.118, lr=0.0001] Steps: 65%|██████▌ | 455/700 [03:17<01:46, 2.31it/s, loss=0.118, lr=0.0001] Steps: 65%|██████▌ | 455/700 [03:17<01:46, 2.31it/s, loss=0.0978, lr=0.0001] Steps: 65%|██████▌ | 456/700 [03:18<01:45, 2.31it/s, loss=0.0978, lr=0.0001] Steps: 65%|██████▌ | 456/700 [03:18<01:45, 2.31it/s, loss=0.152, lr=0.0001] Steps: 65%|██████▌ | 457/700 [03:18<01:45, 2.31it/s, loss=0.152, lr=0.0001] Steps: 65%|██████▌ | 457/700 [03:18<01:45, 2.31it/s, loss=0.095, lr=0.0001] Steps: 65%|██████▌ | 458/700 [03:18<01:44, 2.31it/s, loss=0.095, lr=0.0001] Steps: 65%|██████▌ | 458/700 [03:18<01:44, 2.31it/s, loss=0.178, lr=0.0001] Steps: 66%|██████▌ | 459/700 [03:19<01:44, 2.31it/s, loss=0.178, lr=0.0001] Steps: 66%|██████▌ | 459/700 [03:19<01:44, 2.31it/s, loss=0.161, lr=0.0001] Steps: 66%|██████▌ | 460/700 [03:19<01:43, 2.31it/s, loss=0.161, lr=0.0001] Steps: 66%|██████▌ | 460/700 [03:19<01:43, 2.31it/s, loss=0.135, lr=0.0001] Steps: 66%|██████▌ | 461/700 [03:20<01:43, 2.31it/s, loss=0.135, lr=0.0001] Steps: 66%|██████▌ | 461/700 [03:20<01:43, 2.31it/s, loss=0.165, lr=0.0001] Steps: 66%|██████▌ | 462/700 [03:20<01:43, 2.31it/s, loss=0.165, lr=0.0001] Steps: 66%|██████▌ | 462/700 [03:20<01:43, 2.31it/s, loss=0.162, lr=0.0001] Steps: 66%|██████▌ | 463/700 [03:21<01:42, 2.31it/s, loss=0.162, lr=0.0001] Steps: 66%|██████▌ | 463/700 [03:21<01:42, 2.31it/s, loss=0.177, lr=0.0001] Steps: 66%|██████▋ | 464/700 [03:21<01:42, 
2.31it/s, loss=0.177, lr=0.0001] Steps: 66%|██████▋ | 464/700 [03:21<01:42, 2.31it/s, loss=0.158, lr=0.0001] Steps: 66%|██████▋ | 465/700 [03:21<01:42, 2.30it/s, loss=0.158, lr=0.0001] Steps: 66%|██████▋ | 465/700 [03:21<01:42, 2.30it/s, loss=0.203, lr=0.0001] Steps: 67%|██████▋ | 466/700 [03:22<01:41, 2.30it/s, loss=0.203, lr=0.0001] Steps: 67%|██████▋ | 466/700 [03:22<01:41, 2.30it/s, loss=0.0449, lr=0.0001] Steps: 67%|██████▋ | 467/700 [03:22<01:41, 2.31it/s, loss=0.0449, lr=0.0001] Steps: 67%|██████▋ | 467/700 [03:22<01:41, 2.31it/s, loss=0.259, lr=0.0001] Steps: 67%|██████▋ | 468/700 [03:23<01:40, 2.31it/s, loss=0.259, lr=0.0001] Steps: 67%|██████▋ | 468/700 [03:23<01:40, 2.31it/s, loss=0.177, lr=0.0001] Steps: 67%|██████▋ | 469/700 [03:23<01:40, 2.31it/s, loss=0.177, lr=0.0001] Steps: 67%|██████▋ | 469/700 [03:23<01:40, 2.31it/s, loss=0.118, lr=0.0001] Steps: 67%|██████▋ | 470/700 [03:24<01:39, 2.31it/s, loss=0.118, lr=0.0001] Steps: 67%|██████▋ | 470/700 [03:24<01:39, 2.31it/s, loss=0.164, lr=0.0001] Steps: 67%|██████▋ | 471/700 [03:24<01:39, 2.31it/s, loss=0.164, lr=0.0001] Steps: 67%|██████▋ | 471/700 [03:24<01:39, 2.31it/s, loss=0.0637, lr=0.0001] Steps: 67%|██████▋ | 472/700 [03:24<01:38, 2.31it/s, loss=0.0637, lr=0.0001] Steps: 67%|██████▋ | 472/700 [03:25<01:38, 2.31it/s, loss=0.101, lr=0.0001] Steps: 68%|██████▊ | 473/700 [03:25<01:38, 2.31it/s, loss=0.101, lr=0.0001] Steps: 68%|██████▊ | 473/700 [03:25<01:38, 2.31it/s, loss=0.197, lr=0.0001] Steps: 68%|██████▊ | 474/700 [03:25<01:37, 2.31it/s, loss=0.197, lr=0.0001] Steps: 68%|██████▊ | 474/700 [03:25<01:37, 2.31it/s, loss=0.246, lr=0.0001] Steps: 68%|██████▊ | 475/700 [03:26<01:37, 2.31it/s, loss=0.246, lr=0.0001] Steps: 68%|██████▊ | 475/700 [03:26<01:37, 2.31it/s, loss=0.0803, lr=0.0001] Steps: 68%|██████▊ | 476/700 [03:26<01:36, 2.31it/s, loss=0.0803, lr=0.0001] Steps: 68%|██████▊ | 476/700 [03:26<01:36, 2.31it/s, loss=0.131, lr=0.0001] Steps: 68%|██████▊ | 477/700 [03:27<01:36, 2.31it/s, 
loss=0.131, lr=0.0001] Steps: 68%|██████▊ | 477/700 [03:27<01:36, 2.31it/s, loss=0.0571, lr=0.0001] Steps: 68%|██████▊ | 478/700 [03:27<01:36, 2.31it/s, loss=0.0571, lr=0.0001] Steps: 68%|██████▊ | 478/700 [03:27<01:36, 2.31it/s, loss=0.126, lr=0.0001] Steps: 68%|██████▊ | 479/700 [03:27<01:35, 2.31it/s, loss=0.126, lr=0.0001] Steps: 68%|██████▊ | 479/700 [03:28<01:35, 2.31it/s, loss=0.148, lr=0.0001] Steps: 69%|██████▊ | 480/700 [03:28<01:35, 2.31it/s, loss=0.148, lr=0.0001] Steps: 69%|██████▊ | 480/700 [03:28<01:35, 2.31it/s, loss=0.0757, lr=0.0001] Steps: 69%|██████▊ | 481/700 [03:28<01:35, 2.30it/s, loss=0.0757, lr=0.0001] Steps: 69%|██████▊ | 481/700 [03:28<01:35, 2.30it/s, loss=0.118, lr=0.0001] Steps: 69%|██████▉ | 482/700 [03:29<01:34, 2.30it/s, loss=0.118, lr=0.0001] Steps: 69%|██████▉ | 482/700 [03:29<01:34, 2.30it/s, loss=0.233, lr=0.0001] Steps: 69%|██████▉ | 483/700 [03:29<01:34, 2.30it/s, loss=0.233, lr=0.0001] Steps: 69%|██████▉ | 483/700 [03:29<01:34, 2.30it/s, loss=0.146, lr=0.0001] Steps: 69%|██████▉ | 484/700 [03:30<01:33, 2.31it/s, loss=0.146, lr=0.0001] Steps: 69%|██████▉ | 484/700 [03:30<01:33, 2.31it/s, loss=0.129, lr=0.0001] Steps: 69%|██████▉ | 485/700 [03:30<01:33, 2.31it/s, loss=0.129, lr=0.0001] Steps: 69%|██████▉ | 485/700 [03:30<01:33, 2.31it/s, loss=0.179, lr=0.0001] Steps: 69%|██████▉ | 486/700 [03:31<01:32, 2.31it/s, loss=0.179, lr=0.0001] Steps: 69%|██████▉ | 486/700 [03:31<01:32, 2.31it/s, loss=0.0674, lr=0.0001] Steps: 70%|██████▉ | 487/700 [03:31<01:32, 2.31it/s, loss=0.0674, lr=0.0001] Steps: 70%|██████▉ | 487/700 [03:31<01:32, 2.31it/s, loss=0.187, lr=0.0001] Steps: 70%|██████▉ | 488/700 [03:31<01:31, 2.31it/s, loss=0.187, lr=0.0001] Steps: 70%|██████▉ | 488/700 [03:31<01:31, 2.31it/s, loss=0.106, lr=0.0001] Steps: 70%|██████▉ | 489/700 [03:32<01:31, 2.31it/s, loss=0.106, lr=0.0001] Steps: 70%|██████▉ | 489/700 [03:32<01:31, 2.31it/s, loss=0.0499, lr=0.0001] Steps: 70%|███████ | 490/700 [03:32<01:30, 2.31it/s, loss=0.0499, 
lr=0.0001] Steps: 70%|███████ | 490/700 [03:32<01:30, 2.31it/s, loss=0.11, lr=0.0001] Steps: 70%|███████ | 491/700 [03:33<01:30, 2.31it/s, loss=0.11, lr=0.0001] Steps: 70%|███████ | 491/700 [03:33<01:30, 2.31it/s, loss=0.0632, lr=0.0001] Steps: 70%|███████ | 492/700 [03:33<01:30, 2.31it/s, loss=0.0632, lr=0.0001] Steps: 70%|███████ | 492/700 [03:33<01:30, 2.31it/s, loss=0.0964, lr=0.0001] Steps: 70%|███████ | 493/700 [03:34<01:29, 2.31it/s, loss=0.0964, lr=0.0001] Steps: 70%|███████ | 493/700 [03:34<01:29, 2.31it/s, loss=0.0333, lr=0.0001] Steps: 71%|███████ | 494/700 [03:34<01:29, 2.31it/s, loss=0.0333, lr=0.0001] Steps: 71%|███████ | 494/700 [03:34<01:29, 2.31it/s, loss=0.094, lr=0.0001] Steps: 71%|███████ | 495/700 [03:34<01:28, 2.31it/s, loss=0.094, lr=0.0001] Steps: 71%|███████ | 495/700 [03:34<01:28, 2.31it/s, loss=0.115, lr=0.0001] Steps: 71%|███████ | 496/700 [03:35<01:28, 2.31it/s, loss=0.115, lr=0.0001] Steps: 71%|███████ | 496/700 [03:35<01:28, 2.31it/s, loss=0.0327, lr=0.0001] Steps: 71%|███████ | 497/700 [03:35<01:28, 2.30it/s, loss=0.0327, lr=0.0001] Steps: 71%|███████ | 497/700 [03:35<01:28, 2.30it/s, loss=0.14, lr=0.0001] Steps: 71%|███████ | 498/700 [03:36<01:27, 2.30it/s, loss=0.14, lr=0.0001] Steps: 71%|███████ | 498/700 [03:36<01:27, 2.30it/s, loss=0.0866, lr=0.0001] Steps: 71%|███████▏ | 499/700 [03:36<01:27, 2.31it/s, loss=0.0866, lr=0.0001] Steps: 71%|███████▏ | 499/700 [03:36<01:27, 2.31it/s, loss=0.132, lr=0.0001] Steps: 71%|███████▏ | 500/700 [03:37<01:26, 2.31it/s, loss=0.132, lr=0.0001] Steps: 71%|███████▏ | 500/700 [03:37<01:26, 2.31it/s, loss=0.119, lr=0.0001] Steps: 72%|███████▏ | 501/700 [03:37<01:26, 2.31it/s, loss=0.119, lr=0.0001] Steps: 72%|███████▏ | 501/700 [03:37<01:26, 2.31it/s, loss=0.129, lr=0.0001] Steps: 72%|███████▏ | 502/700 [03:37<01:25, 2.31it/s, loss=0.129, lr=0.0001] Steps: 72%|███████▏ | 502/700 [03:37<01:25, 2.31it/s, loss=0.128, lr=0.0001] Steps: 72%|███████▏ | 503/700 [03:38<01:25, 2.31it/s, loss=0.128, 
lr=0.0001] Steps: 72%|███████▏ | 503/700 [03:38<01:25, 2.31it/s, loss=0.121, lr=0.0001] Steps: 72%|███████▏ | 504/700 [03:38<01:24, 2.31it/s, loss=0.121, lr=0.0001] Steps: 72%|███████▏ | 504/700 [03:38<01:24, 2.31it/s, loss=0.134, lr=0.0001] Steps: 72%|███████▏ | 505/700 [03:39<01:24, 2.31it/s, loss=0.134, lr=0.0001] Steps: 72%|███████▏ | 505/700 [03:39<01:24, 2.31it/s, loss=0.108, lr=0.0001] Steps: 72%|███████▏ | 506/700 [03:39<01:24, 2.31it/s, loss=0.108, lr=0.0001] Steps: 72%|███████▏ | 506/700 [03:39<01:24, 2.31it/s, loss=0.06, lr=0.0001] Steps: 72%|███████▏ | 507/700 [03:40<01:23, 2.31it/s, loss=0.06, lr=0.0001] Steps: 72%|███████▏ | 507/700 [03:40<01:23, 2.31it/s, loss=0.144, lr=0.0001] Steps: 73%|███████▎ | 508/700 [03:40<01:23, 2.31it/s, loss=0.144, lr=0.0001] Steps: 73%|███████▎ | 508/700 [03:40<01:23, 2.31it/s, loss=0.0841, lr=0.0001] Steps: 73%|███████▎ | 509/700 [03:40<01:22, 2.31it/s, loss=0.0841, lr=0.0001] Steps: 73%|███████▎ | 509/700 [03:41<01:22, 2.31it/s, loss=0.104, lr=0.0001] Steps: 73%|███████▎ | 510/700 [03:41<01:22, 2.31it/s, loss=0.104, lr=0.0001] Steps: 73%|███████▎ | 510/700 [03:41<01:22, 2.31it/s, loss=0.0856, lr=0.0001] Steps: 73%|███████▎ | 511/700 [03:41<01:21, 2.31it/s, loss=0.0856, lr=0.0001] Steps: 73%|███████▎ | 511/700 [03:41<01:21, 2.31it/s, loss=0.16, lr=0.0001] Steps: 73%|███████▎ | 512/700 [03:42<01:21, 2.31it/s, loss=0.16, lr=0.0001] Steps: 73%|███████▎ | 512/700 [03:42<01:21, 2.31it/s, loss=0.0192, lr=0.0001] Steps: 73%|███████▎ | 513/700 [03:42<01:21, 2.30it/s, loss=0.0192, lr=0.0001] Steps: 73%|███████▎ | 513/700 [03:42<01:21, 2.30it/s, loss=0.0949, lr=0.0001] Steps: 73%|███████▎ | 514/700 [03:43<01:20, 2.30it/s, loss=0.0949, lr=0.0001] Steps: 73%|███████▎ | 514/700 [03:43<01:20, 2.30it/s, loss=0.223, lr=0.0001] Steps: 74%|███████▎ | 515/700 [03:43<01:20, 2.30it/s, loss=0.223, lr=0.0001] Steps: 74%|███████▎ | 515/700 [03:43<01:20, 2.30it/s, loss=0.164, lr=0.0001] Steps: 74%|███████▎ | 516/700 [03:44<01:19, 2.31it/s, 
loss=0.164, lr=0.0001] Steps: 74%|███████▎ | 516/700 [03:44<01:19, 2.31it/s, loss=0.0825, lr=0.0001] Steps: 74%|███████▍ | 517/700 [03:44<01:19, 2.31it/s, loss=0.0825, lr=0.0001] Steps: 74%|███████▍ | 517/700 [03:44<01:19, 2.31it/s, loss=0.133, lr=0.0001] Steps: 74%|███████▍ | 518/700 [03:44<01:18, 2.31it/s, loss=0.133, lr=0.0001] Steps: 74%|███████▍ | 518/700 [03:44<01:18, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 74%|███████▍ | 519/700 [03:45<01:18, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 74%|███████▍ | 519/700 [03:45<01:18, 2.31it/s, loss=0.162, lr=0.0001] Steps: 74%|███████▍ | 520/700 [03:45<01:18, 2.30it/s, loss=0.162, lr=0.0001] Steps: 74%|███████▍ | 520/700 [03:45<01:18, 2.30it/s, loss=0.102, lr=0.0001] Steps: 74%|███████▍ | 521/700 [03:46<01:17, 2.31it/s, loss=0.102, lr=0.0001] Steps: 74%|███████▍ | 521/700 [03:46<01:17, 2.31it/s, loss=0.145, lr=0.0001] Steps: 75%|███████▍ | 522/700 [03:46<01:17, 2.31it/s, loss=0.145, lr=0.0001] Steps: 75%|███████▍ | 522/700 [03:46<01:17, 2.31it/s, loss=0.0441, lr=0.0001] Steps: 75%|███████▍ | 523/700 [03:47<01:16, 2.31it/s, loss=0.0441, lr=0.0001] Steps: 75%|███████▍ | 523/700 [03:47<01:16, 2.31it/s, loss=0.119, lr=0.0001] Steps: 75%|███████▍ | 524/700 [03:47<01:16, 2.31it/s, loss=0.119, lr=0.0001] Steps: 75%|███████▍ | 524/700 [03:47<01:16, 2.31it/s, loss=0.0832, lr=0.0001] Steps: 75%|███████▌ | 525/700 [03:47<01:16, 2.30it/s, loss=0.0832, lr=0.0001] Steps: 75%|███████▌ | 525/700 [03:47<01:16, 2.30it/s, loss=0.136, lr=0.0001] Steps: 75%|███████▌ | 526/700 [03:48<01:15, 2.30it/s, loss=0.136, lr=0.0001] Steps: 75%|███████▌ | 526/700 [03:48<01:15, 2.30it/s, loss=0.124, lr=0.0001] Steps: 75%|███████▌ | 527/700 [03:48<01:15, 2.30it/s, loss=0.124, lr=0.0001] Steps: 75%|███████▌ | 527/700 [03:48<01:15, 2.30it/s, loss=0.0421, lr=0.0001] Steps: 75%|███████▌ | 528/700 [03:49<01:14, 2.31it/s, loss=0.0421, lr=0.0001] Steps: 75%|███████▌ | 528/700 [03:49<01:14, 2.31it/s, loss=0.0114, lr=0.0001] Steps: 76%|███████▌ | 529/700 
[03:49<01:14, 2.30it/s, loss=0.0114, lr=0.0001] Steps: 76%|███████▌ | 529/700 [03:49<01:14, 2.30it/s, loss=0.134, lr=0.0001] Steps: 76%|███████▌ | 530/700 [03:50<01:13, 2.30it/s, loss=0.134, lr=0.0001] Steps: 76%|███████▌ | 530/700 [03:50<01:13, 2.30it/s, loss=0.0501, lr=0.0001] Steps: 76%|███████▌ | 531/700 [03:50<01:13, 2.30it/s, loss=0.0501, lr=0.0001] Steps: 76%|███████▌ | 531/700 [03:50<01:13, 2.30it/s, loss=0.0874, lr=0.0001] Steps: 76%|███████▌ | 532/700 [03:50<01:12, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 76%|███████▌ | 532/700 [03:51<01:12, 2.31it/s, loss=0.0677, lr=0.0001] Steps: 76%|███████▌ | 533/700 [03:51<01:12, 2.31it/s, loss=0.0677, lr=0.0001] Steps: 76%|███████▌ | 533/700 [03:51<01:12, 2.31it/s, loss=0.299, lr=0.0001] Steps: 76%|███████▋ | 534/700 [03:51<01:12, 2.30it/s, loss=0.299, lr=0.0001] Steps: 76%|███████▋ | 534/700 [03:51<01:12, 2.30it/s, loss=0.12, lr=0.0001] Steps: 76%|███████▋ | 535/700 [03:52<01:11, 2.31it/s, loss=0.12, lr=0.0001] Steps: 76%|███████▋ | 535/700 [03:52<01:11, 2.31it/s, loss=0.279, lr=0.0001] Steps: 77%|███████▋ | 536/700 [03:52<01:11, 2.31it/s, loss=0.279, lr=0.0001] Steps: 77%|███████▋ | 536/700 [03:52<01:11, 2.31it/s, loss=0.109, lr=0.0001] Steps: 77%|███████▋ | 537/700 [03:53<01:10, 2.31it/s, loss=0.109, lr=0.0001] Steps: 77%|███████▋ | 537/700 [03:53<01:10, 2.31it/s, loss=0.0592, lr=0.0001] Steps: 77%|███████▋ | 538/700 [03:53<01:10, 2.31it/s, loss=0.0592, lr=0.0001] Steps: 77%|███████▋ | 538/700 [03:53<01:10, 2.31it/s, loss=0.101, lr=0.0001] Steps: 77%|███████▋ | 539/700 [03:54<01:09, 2.30it/s, loss=0.101, lr=0.0001] Steps: 77%|███████▋ | 539/700 [03:54<01:09, 2.30it/s, loss=0.0438, lr=0.0001] Steps: 77%|███████▋ | 540/700 [03:54<01:09, 2.30it/s, loss=0.0438, lr=0.0001] Steps: 77%|███████▋ | 540/700 [03:54<01:09, 2.30it/s, loss=0.101, lr=0.0001] Steps: 77%|███████▋ | 541/700 [03:54<01:09, 2.30it/s, loss=0.101, lr=0.0001] Steps: 77%|███████▋ | 541/700 [03:54<01:09, 2.30it/s, loss=0.139, lr=0.0001] Steps: 
77%|███████▋ | 542/700 [03:55<01:08, 2.30it/s, loss=0.139, lr=0.0001] Steps: 77%|███████▋ | 542/700 [03:55<01:08, 2.30it/s, loss=0.198, lr=0.0001] Steps: 78%|███████▊ | 543/700 [03:55<01:08, 2.30it/s, loss=0.198, lr=0.0001] Steps: 78%|███████▊ | 543/700 [03:55<01:08, 2.30it/s, loss=0.171, lr=0.0001] Steps: 78%|███████▊ | 544/700 [03:56<01:07, 2.31it/s, loss=0.171, lr=0.0001] Steps: 78%|███████▊ | 544/700 [03:56<01:07, 2.31it/s, loss=0.11, lr=0.0001] Steps: 78%|███████▊ | 545/700 [03:56<01:07, 2.30it/s, loss=0.11, lr=0.0001] Steps: 78%|███████▊ | 545/700 [03:56<01:07, 2.30it/s, loss=0.117, lr=0.0001] Steps: 78%|███████▊ | 546/700 [03:57<01:06, 2.30it/s, loss=0.117, lr=0.0001] Steps: 78%|███████▊ | 546/700 [03:57<01:06, 2.30it/s, loss=0.0327, lr=0.0001] Steps: 78%|███████▊ | 547/700 [03:57<01:06, 2.30it/s, loss=0.0327, lr=0.0001] Steps: 78%|███████▊ | 547/700 [03:57<01:06, 2.30it/s, loss=0.0536, lr=0.0001] Steps: 78%|███████▊ | 548/700 [03:57<01:05, 2.31it/s, loss=0.0536, lr=0.0001] Steps: 78%|███████▊ | 548/700 [03:57<01:05, 2.31it/s, loss=0.1, lr=0.0001] Steps: 78%|███████▊ | 549/700 [03:58<01:05, 2.31it/s, loss=0.1, lr=0.0001] Steps: 78%|███████▊ | 549/700 [03:58<01:05, 2.31it/s, loss=0.113, lr=0.0001] Steps: 79%|███████▊ | 550/700 [03:58<01:04, 2.31it/s, loss=0.113, lr=0.0001] Steps: 79%|███████▊ | 550/700 [03:58<01:04, 2.31it/s, loss=0.0923, lr=0.0001] Steps: 79%|███████▊ | 551/700 [03:59<01:04, 2.31it/s, loss=0.0923, lr=0.0001] Steps: 79%|███████▊ | 551/700 [03:59<01:04, 2.31it/s, loss=0.13, lr=0.0001] Steps: 79%|███████▉ | 552/700 [03:59<01:04, 2.31it/s, loss=0.13, lr=0.0001] Steps: 79%|███████▉ | 552/700 [03:59<01:04, 2.31it/s, loss=0.0919, lr=0.0001] Steps: 79%|███████▉ | 553/700 [04:00<01:03, 2.31it/s, loss=0.0919, lr=0.0001] Steps: 79%|███████▉ | 553/700 [04:00<01:03, 2.31it/s, loss=0.125, lr=0.0001] Steps: 79%|███████▉ | 554/700 [04:00<01:03, 2.31it/s, loss=0.125, lr=0.0001] Steps: 79%|███████▉ | 554/700 [04:00<01:03, 2.31it/s, loss=0.0459, lr=0.0001] 
Steps: 79%|███████▉ | 555/700 [04:00<01:02, 2.31it/s, loss=0.0459, lr=0.0001] Steps: 79%|███████▉ | 555/700 [04:00<01:02, 2.31it/s, loss=0.178, lr=0.0001] Steps: 79%|███████▉ | 556/700 [04:01<01:02, 2.31it/s, loss=0.178, lr=0.0001] Steps: 79%|███████▉ | 556/700 [04:01<01:02, 2.31it/s, loss=0.0118, lr=0.0001] Steps: 80%|███████▉ | 557/700 [04:01<01:01, 2.31it/s, loss=0.0118, lr=0.0001] Steps: 80%|███████▉ | 557/700 [04:01<01:01, 2.31it/s, loss=0.105, lr=0.0001] Steps: 80%|███████▉ | 558/700 [04:02<01:01, 2.31it/s, loss=0.105, lr=0.0001] Steps: 80%|███████▉ | 558/700 [04:02<01:01, 2.31it/s, loss=0.141, lr=0.0001] Steps: 80%|███████▉ | 559/700 [04:02<01:01, 2.31it/s, loss=0.141, lr=0.0001] Steps: 80%|███████▉ | 559/700 [04:02<01:01, 2.31it/s, loss=0.135, lr=0.0001] Steps: 80%|████████ | 560/700 [04:03<01:00, 2.31it/s, loss=0.135, lr=0.0001] Steps: 80%|████████ | 560/700 [04:03<01:00, 2.31it/s, loss=0.118, lr=0.0001] Steps: 80%|████████ | 561/700 [04:03<01:00, 2.30it/s, loss=0.118, lr=0.0001] Steps: 80%|████████ | 561/700 [04:03<01:00, 2.30it/s, loss=0.162, lr=0.0001] Steps: 80%|████████ | 562/700 [04:03<00:59, 2.30it/s, loss=0.162, lr=0.0001] Steps: 80%|████████ | 562/700 [04:04<00:59, 2.30it/s, loss=0.0823, lr=0.0001] Steps: 80%|████████ | 563/700 [04:04<00:59, 2.30it/s, loss=0.0823, lr=0.0001] Steps: 80%|████████ | 563/700 [04:04<00:59, 2.30it/s, loss=0.182, lr=0.0001] Steps: 81%|████████ | 564/700 [04:04<00:59, 2.30it/s, loss=0.182, lr=0.0001] Steps: 81%|████████ | 564/700 [04:04<00:59, 2.30it/s, loss=0.118, lr=0.0001] Steps: 81%|████████ | 565/700 [04:05<00:58, 2.31it/s, loss=0.118, lr=0.0001] Steps: 81%|████████ | 565/700 [04:05<00:58, 2.31it/s, loss=0.0902, lr=0.0001] Steps: 81%|████████ | 566/700 [04:05<00:58, 2.31it/s, loss=0.0902, lr=0.0001] Steps: 81%|████████ | 566/700 [04:05<00:58, 2.31it/s, loss=0.0953, lr=0.0001] Steps: 81%|████████ | 567/700 [04:06<00:57, 2.31it/s, loss=0.0953, lr=0.0001] Steps: 81%|████████ | 567/700 [04:06<00:57, 2.31it/s, loss=0.126, 
lr=0.0001] Steps: 81%|████████ | 568/700 [04:06<00:57, 2.31it/s, loss=0.126, lr=0.0001] Steps: 81%|████████ | 568/700 [04:06<00:57, 2.31it/s, loss=0.0431, lr=0.0001] Steps: 81%|████████▏ | 569/700 [04:07<00:56, 2.31it/s, loss=0.0431, lr=0.0001] Steps: 81%|████████▏ | 569/700 [04:07<00:56, 2.31it/s, loss=0.0227, lr=0.0001] Steps: 81%|████████▏ | 570/700 [04:07<00:56, 2.31it/s, loss=0.0227, lr=0.0001] Steps: 81%|████████▏ | 570/700 [04:07<00:56, 2.31it/s, loss=0.192, lr=0.0001] Steps: 82%|████████▏ | 571/700 [04:07<00:55, 2.31it/s, loss=0.192, lr=0.0001] Steps: 82%|████████▏ | 571/700 [04:07<00:55, 2.31it/s, loss=0.189, lr=0.0001] Steps: 82%|████████▏ | 572/700 [04:08<00:55, 2.31it/s, loss=0.189, lr=0.0001] Steps: 82%|████████▏ | 572/700 [04:08<00:55, 2.31it/s, loss=0.116, lr=0.0001] Steps: 82%|████████▏ | 573/700 [04:08<00:55, 2.31it/s, loss=0.116, lr=0.0001] Steps: 82%|████████▏ | 573/700 [04:08<00:55, 2.31it/s, loss=0.156, lr=0.0001] Steps: 82%|████████▏ | 574/700 [04:09<00:54, 2.31it/s, loss=0.156, lr=0.0001] Steps: 82%|████████▏ | 574/700 [04:09<00:54, 2.31it/s, loss=0.133, lr=0.0001] Steps: 82%|████████▏ | 575/700 [04:09<00:54, 2.31it/s, loss=0.133, lr=0.0001] Steps: 82%|████████▏ | 575/700 [04:09<00:54, 2.31it/s, loss=0.0888, lr=0.0001] Steps: 82%|████████▏ | 576/700 [04:10<00:53, 2.31it/s, loss=0.0888, lr=0.0001] Steps: 82%|████████▏ | 576/700 [04:10<00:53, 2.31it/s, loss=0.128, lr=0.0001] Steps: 82%|████████▏ | 577/700 [04:10<00:53, 2.30it/s, loss=0.128, lr=0.0001] Steps: 82%|████████▏ | 577/700 [04:10<00:53, 2.30it/s, loss=0.154, lr=0.0001] Steps: 83%|████████▎ | 578/700 [04:10<00:53, 2.30it/s, loss=0.154, lr=0.0001] Steps: 83%|████████▎ | 578/700 [04:10<00:53, 2.30it/s, loss=0.062, lr=0.0001] Steps: 83%|████████▎ | 579/700 [04:11<00:52, 2.30it/s, loss=0.062, lr=0.0001] Steps: 83%|████████▎ | 579/700 [04:11<00:52, 2.30it/s, loss=0.11, lr=0.0001] Steps: 83%|████████▎ | 580/700 [04:11<00:52, 2.31it/s, loss=0.11, lr=0.0001] Steps: 83%|████████▎ | 580/700 
[04:11<00:52, 2.31it/s, loss=0.0333, lr=0.0001] Steps: 83%|████████▎ | 581/700 [04:12<00:51, 2.31it/s, loss=0.0333, lr=0.0001] Steps: 83%|████████▎ | 581/700 [04:12<00:51, 2.31it/s, loss=0.0944, lr=0.0001] Steps: 83%|████████▎ | 582/700 [04:12<00:51, 2.31it/s, loss=0.0944, lr=0.0001] Steps: 83%|████████▎ | 582/700 [04:12<00:51, 2.31it/s, loss=0.106, lr=0.0001] Steps: 83%|████████▎ | 583/700 [04:13<00:50, 2.31it/s, loss=0.106, lr=0.0001] Steps: 83%|████████▎ | 583/700 [04:13<00:50, 2.31it/s, loss=0.125, lr=0.0001] Steps: 83%|████████▎ | 584/700 [04:13<00:50, 2.31it/s, loss=0.125, lr=0.0001] Steps: 83%|████████▎ | 584/700 [04:13<00:50, 2.31it/s, loss=0.0806, lr=0.0001] Steps: 84%|████████▎ | 585/700 [04:13<00:49, 2.31it/s, loss=0.0806, lr=0.0001] Steps: 84%|████████▎ | 585/700 [04:13<00:49, 2.31it/s, loss=0.157, lr=0.0001] Steps: 84%|████████▎ | 586/700 [04:14<00:49, 2.31it/s, loss=0.157, lr=0.0001] Steps: 84%|████████▎ | 586/700 [04:14<00:49, 2.31it/s, loss=0.0135, lr=0.0001] Steps: 84%|████████▍ | 587/700 [04:14<00:48, 2.31it/s, loss=0.0135, lr=0.0001] Steps: 84%|████████▍ | 587/700 [04:14<00:48, 2.31it/s, loss=0.244, lr=0.0001] Steps: 84%|████████▍ | 588/700 [04:15<00:48, 2.31it/s, loss=0.244, lr=0.0001] Steps: 84%|████████▍ | 588/700 [04:15<00:48, 2.31it/s, loss=0.148, lr=0.0001] Steps: 84%|████████▍ | 589/700 [04:15<00:48, 2.31it/s, loss=0.148, lr=0.0001] Steps: 84%|████████▍ | 589/700 [04:15<00:48, 2.31it/s, loss=0.118, lr=0.0001] Steps: 84%|████████▍ | 590/700 [04:16<00:47, 2.31it/s, loss=0.118, lr=0.0001] Steps: 84%|████████▍ | 590/700 [04:16<00:47, 2.31it/s, loss=0.128, lr=0.0001] Steps: 84%|████████▍ | 591/700 [04:16<00:47, 2.31it/s, loss=0.128, lr=0.0001] Steps: 84%|████████▍ | 591/700 [04:16<00:47, 2.31it/s, loss=0.148, lr=0.0001] Steps: 85%|████████▍ | 592/700 [04:16<00:46, 2.31it/s, loss=0.148, lr=0.0001] Steps: 85%|████████▍ | 592/700 [04:17<00:46, 2.31it/s, loss=0.278, lr=0.0001] Steps: 85%|████████▍ | 593/700 [04:17<00:46, 2.30it/s, loss=0.278, 
lr=0.0001] Steps: 85%|████████▍ | 593/700 [04:17<00:46, 2.30it/s, loss=0.134, lr=0.0001] Steps: 85%|████████▍ | 594/700 [04:17<00:46, 2.30it/s, loss=0.134, lr=0.0001] Steps: 85%|████████▍ | 594/700 [04:17<00:46, 2.30it/s, loss=0.0929, lr=0.0001] Steps: 85%|████████▌ | 595/700 [04:18<00:45, 2.30it/s, loss=0.0929, lr=0.0001] Steps: 85%|████████▌ | 595/700 [04:18<00:45, 2.30it/s, loss=0.102, lr=0.0001] Steps: 85%|████████▌ | 596/700 [04:18<00:45, 2.31it/s, loss=0.102, lr=0.0001] Steps: 85%|████████▌ | 596/700 [04:18<00:45, 2.31it/s, loss=0.0314, lr=0.0001] Steps: 85%|████████▌ | 597/700 [04:19<00:44, 2.31it/s, loss=0.0314, lr=0.0001] Steps: 85%|████████▌ | 597/700 [04:19<00:44, 2.31it/s, loss=0.15, lr=0.0001] Steps: 85%|████████▌ | 598/700 [04:19<00:44, 2.31it/s, loss=0.15, lr=0.0001] Steps: 85%|████████▌ | 598/700 [04:19<00:44, 2.31it/s, loss=0.104, lr=0.0001] Steps: 86%|████████▌ | 599/700 [04:20<00:43, 2.31it/s, loss=0.104, lr=0.0001] Steps: 86%|████████▌ | 599/700 [04:20<00:43, 2.31it/s, loss=0.0743, lr=0.0001] Steps: 86%|████████▌ | 600/700 [04:20<00:43, 2.31it/s, loss=0.0743, lr=0.0001] Steps: 86%|████████▌ | 600/700 [04:20<00:43, 2.31it/s, loss=0.128, lr=0.0001] Steps: 86%|████████▌ | 601/700 [04:20<00:42, 2.31it/s, loss=0.128, lr=0.0001] Steps: 86%|████████▌ | 601/700 [04:20<00:42, 2.31it/s, loss=0.123, lr=0.0001] Steps: 86%|████████▌ | 602/700 [04:21<00:42, 2.31it/s, loss=0.123, lr=0.0001] Steps: 86%|████████▌ | 602/700 [04:21<00:42, 2.31it/s, loss=0.111, lr=0.0001] Steps: 86%|████████▌ | 603/700 [04:21<00:41, 2.31it/s, loss=0.111, lr=0.0001] Steps: 86%|████████▌ | 603/700 [04:21<00:41, 2.31it/s, loss=0.071, lr=0.0001] Steps: 86%|████████▋ | 604/700 [04:22<00:41, 2.31it/s, loss=0.071, lr=0.0001] Steps: 86%|████████▋ | 604/700 [04:22<00:41, 2.31it/s, loss=0.255, lr=0.0001] Steps: 86%|████████▋ | 605/700 [04:22<00:41, 2.31it/s, loss=0.255, lr=0.0001] Steps: 86%|████████▋ | 605/700 [04:22<00:41, 2.31it/s, loss=0.069, lr=0.0001] Steps: 87%|████████▋ | 606/700 
[04:23<00:40, 2.31it/s, loss=0.069, lr=0.0001] Steps: 87%|████████▋ | 606/700 [04:23<00:40, 2.31it/s, loss=0.127, lr=0.0001] Steps: 87%|████████▋ | 607/700 [04:23<00:40, 2.31it/s, loss=0.127, lr=0.0001] Steps: 87%|████████▋ | 607/700 [04:23<00:40, 2.31it/s, loss=0.176, lr=0.0001] Steps: 87%|████████▋ | 608/700 [04:23<00:39, 2.31it/s, loss=0.176, lr=0.0001] Steps: 87%|████████▋ | 608/700 [04:23<00:39, 2.31it/s, loss=0.131, lr=0.0001] Steps: 87%|████████▋ | 609/700 [04:24<00:39, 2.29it/s, loss=0.131, lr=0.0001] Steps: 87%|████████▋ | 609/700 [04:24<00:39, 2.29it/s, loss=0.265, lr=0.0001] Steps: 87%|████████▋ | 610/700 [04:24<00:39, 2.30it/s, loss=0.265, lr=0.0001] Steps: 87%|████████▋ | 610/700 [04:24<00:39, 2.30it/s, loss=0.19, lr=0.0001] Steps: 87%|████████▋ | 611/700 [04:25<00:38, 2.30it/s, loss=0.19, lr=0.0001] Steps: 87%|████████▋ | 611/700 [04:25<00:38, 2.30it/s, loss=0.143, lr=0.0001] Steps: 87%|████████▋ | 612/700 [04:25<00:38, 2.30it/s, loss=0.143, lr=0.0001] Steps: 87%|████████▋ | 612/700 [04:25<00:38, 2.30it/s, loss=0.11, lr=0.0001] Steps: 88%|████████▊ | 613/700 [04:26<00:37, 2.31it/s, loss=0.11, lr=0.0001] Steps: 88%|████████▊ | 613/700 [04:26<00:37, 2.31it/s, loss=0.327, lr=0.0001] Steps: 88%|████████▊ | 614/700 [04:26<00:37, 2.31it/s, loss=0.327, lr=0.0001] Steps: 88%|████████▊ | 614/700 [04:26<00:37, 2.31it/s, loss=0.127, lr=0.0001] Steps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.127, lr=0.0001] Steps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.0661, lr=0.0001] Steps: 88%|████████▊ | 616/700 [04:27<00:36, 2.31it/s, loss=0.0661, lr=0.0001] Steps: 88%|████████▊ | 616/700 [04:27<00:36, 2.31it/s, loss=0.0279, lr=0.0001] Steps: 88%|████████▊ | 617/700 [04:27<00:35, 2.31it/s, loss=0.0279, lr=0.0001] Steps: 88%|████████▊ | 617/700 [04:27<00:35, 2.31it/s, loss=0.0887, lr=0.0001] Steps: 88%|████████▊ | 618/700 [04:28<00:35, 2.31it/s, loss=0.0887, lr=0.0001] Steps: 88%|████████▊ | 618/700 [04:28<00:35, 2.31it/s, loss=0.222, lr=0.0001] 
Steps: 88%|████████▊ | 619/700 [04:28<00:35, 2.31it/s, loss=0.222, lr=0.0001] Steps: 88%|████████▊ | 619/700 [04:28<00:35, 2.31it/s, loss=0.253, lr=0.0001] Steps: 89%|████████▊ | 620/700 [04:29<00:34, 2.31it/s, loss=0.253, lr=0.0001] Steps: 89%|████████▊ | 620/700 [04:29<00:34, 2.31it/s, loss=0.0884, lr=0.0001] Steps: 89%|████████▊ | 621/700 [04:29<00:34, 2.30it/s, loss=0.0884, lr=0.0001] Steps: 89%|████████▊ | 621/700 [04:29<00:34, 2.30it/s, loss=0.0895, lr=0.0001] Steps: 89%|████████▉ | 622/700 [04:29<00:33, 2.31it/s, loss=0.0895, lr=0.0001] Steps: 89%|████████▉ | 622/700 [04:30<00:33, 2.31it/s, loss=0.113, lr=0.0001] Steps: 89%|████████▉ | 623/700 [04:30<00:33, 2.31it/s, loss=0.113, lr=0.0001] Steps: 89%|████████▉ | 623/700 [04:30<00:33, 2.31it/s, loss=0.0678, lr=0.0001] Steps: 89%|████████▉ | 624/700 [04:30<00:32, 2.31it/s, loss=0.0678, lr=0.0001] Steps: 89%|████████▉ | 624/700 [04:30<00:32, 2.31it/s, loss=0.147, lr=0.0001] Steps: 89%|████████▉ | 625/700 [04:31<00:32, 2.30it/s, loss=0.147, lr=0.0001] Steps: 89%|████████▉ | 625/700 [04:31<00:32, 2.30it/s, loss=0.087, lr=0.0001] Steps: 89%|████████▉ | 626/700 [04:31<00:32, 2.30it/s, loss=0.087, lr=0.0001] Steps: 89%|████████▉ | 626/700 [04:31<00:32, 2.30it/s, loss=0.0731, lr=0.0001] Steps: 90%|████████▉ | 627/700 [04:32<00:31, 2.30it/s, loss=0.0731, lr=0.0001] Steps: 90%|████████▉ | 627/700 [04:32<00:31, 2.30it/s, loss=0.137, lr=0.0001] Steps: 90%|████████▉ | 628/700 [04:32<00:31, 2.31it/s, loss=0.137, lr=0.0001] Steps: 90%|████████▉ | 628/700 [04:32<00:31, 2.31it/s, loss=0.117, lr=0.0001] Steps: 90%|████████▉ | 629/700 [04:33<00:30, 2.31it/s, loss=0.117, lr=0.0001] Steps: 90%|████████▉ | 629/700 [04:33<00:30, 2.31it/s, loss=0.102, lr=0.0001] Steps: 90%|█████████ | 630/700 [04:33<00:30, 2.31it/s, loss=0.102, lr=0.0001] Steps: 90%|█████████ | 630/700 [04:33<00:30, 2.31it/s, loss=0.276, lr=0.0001] Steps: 90%|█████████ | 631/700 [04:33<00:29, 2.31it/s, loss=0.276, lr=0.0001] Steps: 90%|█████████ | 631/700 
[04:33<00:29, 2.31it/s, loss=0.12, lr=0.0001] Steps: 90%|█████████ | 632/700 [04:34<00:29, 2.31it/s, loss=0.12, lr=0.0001] Steps: 90%|█████████ | 632/700 [04:34<00:29, 2.31it/s, loss=0.171, lr=0.0001] Steps: 90%|█████████ | 633/700 [04:34<00:28, 2.31it/s, loss=0.171, lr=0.0001] Steps: 90%|█████████ | 633/700 [04:34<00:28, 2.31it/s, loss=0.0859, lr=0.0001] Steps: 91%|█████████ | 634/700 [04:35<00:28, 2.31it/s, loss=0.0859, lr=0.0001] Steps: 91%|█████████ | 634/700 [04:35<00:28, 2.31it/s, loss=0.0891, lr=0.0001] Steps: 91%|█████████ | 635/700 [04:35<00:28, 2.31it/s, loss=0.0891, lr=0.0001] Steps: 91%|█████████ | 635/700 [04:35<00:28, 2.31it/s, loss=0.122, lr=0.0001] Steps: 91%|█████████ | 636/700 [04:36<00:27, 2.31it/s, loss=0.122, lr=0.0001] Steps: 91%|█████████ | 636/700 [04:36<00:27, 2.31it/s, loss=0.147, lr=0.0001] Steps: 91%|█████████ | 637/700 [04:36<00:27, 2.31it/s, loss=0.147, lr=0.0001] Steps: 91%|█████████ | 637/700 [04:36<00:27, 2.31it/s, loss=0.103, lr=0.0001] Steps: 91%|█████████ | 638/700 [04:36<00:26, 2.31it/s, loss=0.103, lr=0.0001] Steps: 91%|█████████ | 638/700 [04:36<00:26, 2.31it/s, loss=0.212, lr=0.0001] Steps: 91%|█████████▏| 639/700 [04:37<00:26, 2.31it/s, loss=0.212, lr=0.0001] Steps: 91%|█████████▏| 639/700 [04:37<00:26, 2.31it/s, loss=0.125, lr=0.0001] Steps: 91%|█████████▏| 640/700 [04:37<00:25, 2.31it/s, loss=0.125, lr=0.0001] Steps: 91%|█████████▏| 640/700 [04:37<00:25, 2.31it/s, loss=0.222, lr=0.0001] Steps: 92%|█████████▏| 641/700 [04:38<00:25, 2.30it/s, loss=0.222, lr=0.0001] Steps: 92%|█████████▏| 641/700 [04:38<00:25, 2.30it/s, loss=0.145, lr=0.0001] Steps: 92%|█████████▏| 642/700 [04:38<00:25, 2.30it/s, loss=0.145, lr=0.0001] Steps: 92%|█████████▏| 642/700 [04:38<00:25, 2.30it/s, loss=0.0954, lr=0.0001] Steps: 92%|█████████▏| 643/700 [04:39<00:24, 2.31it/s, loss=0.0954, lr=0.0001] Steps: 92%|█████████▏| 643/700 [04:39<00:24, 2.31it/s, loss=0.288, lr=0.0001] Steps: 92%|█████████▏| 644/700 [04:39<00:24, 2.31it/s, loss=0.288, 
lr=0.0001] Steps: 92%|█████████▏| 644/700 [04:39<00:24, 2.31it/s, loss=0.115, lr=0.0001] Steps: 92%|█████████▏| 645/700 [04:39<00:23, 2.31it/s, loss=0.115, lr=0.0001] Steps: 92%|█████████▏| 645/700 [04:39<00:23, 2.31it/s, loss=0.111, lr=0.0001] Steps: 92%|█████████▏| 646/700 [04:40<00:23, 2.31it/s, loss=0.111, lr=0.0001] Steps: 92%|█████████▏| 646/700 [04:40<00:23, 2.31it/s, loss=0.111, lr=0.0001] Steps: 92%|█████████▏| 647/700 [04:40<00:22, 2.31it/s, loss=0.111, lr=0.0001] Steps: 92%|█████████▏| 647/700 [04:40<00:22, 2.31it/s, loss=0.16, lr=0.0001] Steps: 93%|█████████▎| 648/700 [04:41<00:22, 2.31it/s, loss=0.16, lr=0.0001] Steps: 93%|█████████▎| 648/700 [04:41<00:22, 2.31it/s, loss=0.08, lr=0.0001] Steps: 93%|█████████▎| 649/700 [04:41<00:22, 2.31it/s, loss=0.08, lr=0.0001] Steps: 93%|█████████▎| 649/700 [04:41<00:22, 2.31it/s, loss=0.145, lr=0.0001] Steps: 93%|█████████▎| 650/700 [04:42<00:21, 2.31it/s, loss=0.145, lr=0.0001] Steps: 93%|█████████▎| 650/700 [04:42<00:21, 2.31it/s, loss=0.105, lr=0.0001] Steps: 93%|█████████▎| 651/700 [04:42<00:21, 2.31it/s, loss=0.105, lr=0.0001] Steps: 93%|█████████▎| 651/700 [04:42<00:21, 2.31it/s, loss=0.142, lr=0.0001] Steps: 93%|█████████▎| 652/700 [04:42<00:20, 2.31it/s, loss=0.142, lr=0.0001] Steps: 93%|█████████▎| 652/700 [04:43<00:20, 2.31it/s, loss=0.177, lr=0.0001] Steps: 93%|█████████▎| 653/700 [04:43<00:20, 2.31it/s, loss=0.177, lr=0.0001] Steps: 93%|█████████▎| 653/700 [04:43<00:20, 2.31it/s, loss=0.0607, lr=0.0001] Steps: 93%|█████████▎| 654/700 [04:43<00:19, 2.31it/s, loss=0.0607, lr=0.0001] Steps: 93%|█████████▎| 654/700 [04:43<00:19, 2.31it/s, loss=0.131, lr=0.0001] Steps: 94%|█████████▎| 655/700 [04:44<00:19, 2.31it/s, loss=0.131, lr=0.0001] Steps: 94%|█████████▎| 655/700 [04:44<00:19, 2.31it/s, loss=0.0542, lr=0.0001] Steps: 94%|█████████▎| 656/700 [04:44<00:19, 2.31it/s, loss=0.0542, lr=0.0001] Steps: 94%|█████████▎| 656/700 [04:44<00:19, 2.31it/s, loss=0.113, lr=0.0001] Steps: 94%|█████████▍| 657/700 
[04:45<00:18, 2.30it/s, loss=0.113, lr=0.0001] Steps: 94%|█████████▍| 657/700 [04:45<00:18, 2.30it/s, loss=0.173, lr=0.0001] Steps: 94%|█████████▍| 658/700 [04:45<00:18, 2.30it/s, loss=0.173, lr=0.0001] Steps: 94%|█████████▍| 658/700 [04:45<00:18, 2.30it/s, loss=0.0329, lr=0.0001] Steps: 94%|█████████▍| 659/700 [04:46<00:17, 2.31it/s, loss=0.0329, lr=0.0001] Steps: 94%|█████████▍| 659/700 [04:46<00:17, 2.31it/s, loss=0.161, lr=0.0001] Steps: 94%|█████████▍| 660/700 [04:46<00:17, 2.31it/s, loss=0.161, lr=0.0001] Steps: 94%|█████████▍| 660/700 [04:46<00:17, 2.31it/s, loss=0.0519, lr=0.0001] Steps: 94%|█████████▍| 661/700 [04:46<00:16, 2.31it/s, loss=0.0519, lr=0.0001] Steps: 94%|█████████▍| 661/700 [04:46<00:16, 2.31it/s, loss=0.0884, lr=0.0001] Steps: 95%|█████████▍| 662/700 [04:47<00:16, 2.31it/s, loss=0.0884, lr=0.0001] Steps: 95%|█████████▍| 662/700 [04:47<00:16, 2.31it/s, loss=0.108, lr=0.0001] Steps: 95%|█████████▍| 663/700 [04:47<00:16, 2.31it/s, loss=0.108, lr=0.0001] Steps: 95%|█████████▍| 663/700 [04:47<00:16, 2.31it/s, loss=0.0557, lr=0.0001] Steps: 95%|█████████▍| 664/700 [04:48<00:15, 2.31it/s, loss=0.0557, lr=0.0001] Steps: 95%|█████████▍| 664/700 [04:48<00:15, 2.31it/s, loss=0.12, lr=0.0001] Steps: 95%|█████████▌| 665/700 [04:48<00:15, 2.31it/s, loss=0.12, lr=0.0001] Steps: 95%|█████████▌| 665/700 [04:48<00:15, 2.31it/s, loss=0.0976, lr=0.0001] Steps: 95%|█████████▌| 666/700 [04:49<00:14, 2.31it/s, loss=0.0976, lr=0.0001] Steps: 95%|█████████▌| 666/700 [04:49<00:14, 2.31it/s, loss=0.175, lr=0.0001] Steps: 95%|█████████▌| 667/700 [04:49<00:14, 2.31it/s, loss=0.175, lr=0.0001] Steps: 95%|█████████▌| 667/700 [04:49<00:14, 2.31it/s, loss=0.0758, lr=0.0001] Steps: 95%|█████████▌| 668/700 [04:49<00:13, 2.31it/s, loss=0.0758, lr=0.0001] Steps: 95%|█████████▌| 668/700 [04:49<00:13, 2.31it/s, loss=0.154, lr=0.0001] Steps: 96%|█████████▌| 669/700 [04:50<00:13, 2.31it/s, loss=0.154, lr=0.0001] Steps: 96%|█████████▌| 669/700 [04:50<00:13, 2.31it/s, loss=0.0661, 
lr=0.0001] Steps: 96%|█████████▌| 670/700 [04:50<00:12, 2.31it/s, loss=0.0661, lr=0.0001] Steps: 96%|█████████▌| 670/700 [04:50<00:12, 2.31it/s, loss=0.222, lr=0.0001] Steps: 96%|█████████▌| 671/700 [04:51<00:12, 2.31it/s, loss=0.222, lr=0.0001] Steps: 96%|█████████▌| 671/700 [04:51<00:12, 2.31it/s, loss=0.125, lr=0.0001] Steps: 96%|█████████▌| 672/700 [04:51<00:12, 2.31it/s, loss=0.125, lr=0.0001] Steps: 96%|█████████▌| 672/700 [04:51<00:12, 2.31it/s, loss=0.117, lr=0.0001] Steps: 96%|█████████▌| 673/700 [04:52<00:11, 2.30it/s, loss=0.117, lr=0.0001] Steps: 96%|█████████▌| 673/700 [04:52<00:11, 2.30it/s, loss=0.163, lr=0.0001] Steps: 96%|█████████▋| 674/700 [04:52<00:11, 2.30it/s, loss=0.163, lr=0.0001] Steps: 96%|█████████▋| 674/700 [04:52<00:11, 2.30it/s, loss=0.0756, lr=0.0001] Steps: 96%|█████████▋| 675/700 [04:52<00:10, 2.31it/s, loss=0.0756, lr=0.0001] Steps: 96%|█████████▋| 675/700 [04:52<00:10, 2.31it/s, loss=0.178, lr=0.0001] Steps: 97%|█████████▋| 676/700 [04:53<00:10, 2.31it/s, loss=0.178, lr=0.0001] Steps: 97%|█████████▋| 676/700 [04:53<00:10, 2.31it/s, loss=0.104, lr=0.0001] Steps: 97%|█████████▋| 677/700 [04:53<00:09, 2.31it/s, loss=0.104, lr=0.0001] Steps: 97%|█████████▋| 677/700 [04:53<00:09, 2.31it/s, loss=0.139, lr=0.0001] Steps: 97%|█████████▋| 678/700 [04:54<00:09, 2.31it/s, loss=0.139, lr=0.0001] Steps: 97%|█████████▋| 678/700 [04:54<00:09, 2.31it/s, loss=0.0792, lr=0.0001] Steps: 97%|█████████▋| 679/700 [04:54<00:09, 2.31it/s, loss=0.0792, lr=0.0001] Steps: 97%|█████████▋| 679/700 [04:54<00:09, 2.31it/s, loss=0.214, lr=0.0001] Steps: 97%|█████████▋| 680/700 [04:55<00:08, 2.31it/s, loss=0.214, lr=0.0001] Steps: 97%|█████████▋| 680/700 [04:55<00:08, 2.31it/s, loss=0.105, lr=0.0001] Steps: 97%|█████████▋| 681/700 [04:55<00:08, 2.31it/s, loss=0.105, lr=0.0001] Steps: 97%|█████████▋| 681/700 [04:55<00:08, 2.31it/s, loss=0.233, lr=0.0001] Steps: 97%|█████████▋| 682/700 [04:55<00:07, 2.31it/s, loss=0.233, lr=0.0001] Steps: 97%|█████████▋| 682/700 
[04:56<00:07, 2.31it/s, loss=0.107, lr=0.0001] Steps: 98%|█████████▊| 683/700 [04:56<00:07, 2.31it/s, loss=0.107, lr=0.0001] Steps: 98%|█████████▊| 683/700 [04:56<00:07, 2.31it/s, loss=0.125, lr=0.0001] Steps: 98%|█████████▊| 684/700 [04:56<00:06, 2.31it/s, loss=0.125, lr=0.0001] Steps: 98%|█████████▊| 684/700 [04:56<00:06, 2.31it/s, loss=0.176, lr=0.0001] Steps: 98%|█████████▊| 685/700 [04:57<00:06, 2.31it/s, loss=0.176, lr=0.0001] Steps: 98%|█████████▊| 685/700 [04:57<00:06, 2.31it/s, loss=0.0955, lr=0.0001] Steps: 98%|█████████▊| 686/700 [04:57<00:06, 2.31it/s, loss=0.0955, lr=0.0001] Steps: 98%|█████████▊| 686/700 [04:57<00:06, 2.31it/s, loss=0.11, lr=0.0001] Steps: 98%|█████████▊| 687/700 [04:58<00:05, 2.31it/s, loss=0.11, lr=0.0001] Steps: 98%|█████████▊| 687/700 [04:58<00:05, 2.31it/s, loss=0.139, lr=0.0001] Steps: 98%|█████████▊| 688/700 [04:58<00:05, 2.31it/s, loss=0.139, lr=0.0001] Steps: 98%|█████████▊| 688/700 [04:58<00:05, 2.31it/s, loss=0.0515, lr=0.0001] Steps: 98%|█████████▊| 689/700 [04:59<00:04, 2.30it/s, loss=0.0515, lr=0.0001] Steps: 98%|█████████▊| 689/700 [04:59<00:04, 2.30it/s, loss=0.102, lr=0.0001] Steps: 99%|█████████▊| 690/700 [04:59<00:04, 2.30it/s, loss=0.102, lr=0.0001] Steps: 99%|█████████▊| 690/700 [04:59<00:04, 2.30it/s, loss=0.174, lr=0.0001] Steps: 99%|█████████▊| 691/700 [04:59<00:03, 2.31it/s, loss=0.174, lr=0.0001] Steps: 99%|█████████▊| 691/700 [04:59<00:03, 2.31it/s, loss=0.161, lr=0.0001] Steps: 99%|█████████▉| 692/700 [05:00<00:03, 2.31it/s, loss=0.161, lr=0.0001] Steps: 99%|█████████▉| 692/700 [05:00<00:03, 2.31it/s, loss=0.103, lr=0.0001] Steps: 99%|█████████▉| 693/700 [05:00<00:03, 2.31it/s, loss=0.103, lr=0.0001] Steps: 99%|█████████▉| 693/700 [05:00<00:03, 2.31it/s, loss=0.0503, lr=0.0001] Steps: 99%|█████████▉| 694/700 [05:01<00:02, 2.31it/s, loss=0.0503, lr=0.0001] Steps: 99%|█████████▉| 694/700 [05:01<00:02, 2.31it/s, loss=0.079, lr=0.0001] Steps: 99%|█████████▉| 695/700 [05:01<00:02, 2.31it/s, loss=0.079, 
lr=0.0001] Steps: 99%|█████████▉| 695/700 [05:01<00:02, 2.31it/s, loss=0.0907, lr=0.0001] Steps: 99%|█████████▉| 696/700 [05:02<00:01, 2.31it/s, loss=0.0907, lr=0.0001] Steps: 99%|█████████▉| 696/700 [05:02<00:01, 2.31it/s, loss=0.108, lr=0.0001] Steps: 100%|█████████▉| 697/700 [05:02<00:01, 2.31it/s, loss=0.108, lr=0.0001] Steps: 100%|█████████▉| 697/700 [05:02<00:01, 2.31it/s, loss=0.165, lr=0.0001] Steps: 100%|█████████▉| 698/700 [05:02<00:00, 2.31it/s, loss=0.165, lr=0.0001] Steps: 100%|█████████▉| 698/700 [05:02<00:00, 2.31it/s, loss=0.194, lr=0.0001] Steps: 100%|█████████▉| 699/700 [05:03<00:00, 2.31it/s, loss=0.194, lr=0.0001] Steps: 100%|█████████▉| 699/700 [05:03<00:00, 2.31it/s, loss=0.229, lr=0.0001] Steps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.229, lr=0.0001] Steps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.141, lr=0.0001]Model weights saved in /tmp/train/output/sd35_large_train_replicate/pytorch_lora_weights.safetensors Loading pipeline components...: 0%| | 0/9 [00:00<?, ?it/s]{'base_image_seq_len', 'base_shift', 'max_shift', 'max_image_seq_len', 'use_dynamic_shifting'} was not found in config. Values will be initialized to default values. Loaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of stable-diffusion-3.5-large. Loaded text_encoder as CLIPTextModelWithProjection from `text_encoder` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 22%|██▏ | 2/9 [00:00<00:01, 5.30it/s] Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s] Loading checkpoint shards: 50%|█████ | 1/2 [00:04<00:04, 4.98s/it] Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.75s/it] Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.79s/it] Loaded text_encoder_3 as T5EncoderModel from `text_encoder_3` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 33%|███▎ | 3/9 [00:09<00:24, 4.12s/it]{'dual_attention_layers'} was not found in config. 
Values will be initialized to default values. Loaded transformer as SD3Transformer2DModel from `transformer` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 44%|████▍ | 4/9 [00:11<00:16, 3.27s/it]Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stable-diffusion-3.5-large. Loaded tokenizer_3 as T5TokenizerFast from `tokenizer_3` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 67%|██████▋ | 6/9 [00:12<00:04, 1.64s/it]Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stable-diffusion-3.5-large. Loaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 89%|████████▉ | 8/9 [00:13<00:01, 1.30s/it]Loaded vae as AutoencoderKL from `vae` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 100%|██████████| 9/9 [00:13<00:00, 1.53s/it] Steps: 100%|██████████| 700/700 [05:18<00:00, 2.20it/s, loss=0.141, lr=0.0001] ./ ./output/ ./output/sd35_large_train_replicate/ ./output/sd35_large_train_replicate/lora.safetensors
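Each training step appears twice in the progress log above: once carrying the previous step's loss and once with the freshly computed one. The following is a minimal sketch, not part of the trainer, of how the per-step loss could be pulled out of such a log with a regular expression. The helper name `losses_by_step` and the sample string (copied from the last three steps of the log) are illustrative assumptions.

```python
import re
from statistics import mean

# Matches "<step>/<total> [ ... loss=<value>" inside one tqdm progress entry.
# The character class [^\]] keeps the match inside a single bracketed block,
# so "Loading pipeline components" entries (which have no loss) are skipped.
STEP_LOSS = re.compile(r"\b(\d+)/(\d+) \[[^\]]*?loss=([0-9.]+)")


def losses_by_step(log_text: str) -> dict[int, float]:
    """Return {step: loss}, keeping the last loss reported for each step."""
    losses: dict[int, float] = {}
    for match in STEP_LOSS.finditer(log_text):
        step = int(match.group(1))
        # Later entries for the same step overwrite earlier ones, so the
        # stale loss carried over from the previous step is discarded.
        losses[step] = float(match.group(3))
    return losses


# Sample copied from the end of the log above (bar characters shortened).
sample = (
    "Steps: 100%|#| 698/700 [05:02<00:00, 2.31it/s, loss=0.165, lr=0.0001] "
    "Steps: 100%|#| 698/700 [05:02<00:00, 2.31it/s, loss=0.194, lr=0.0001] "
    "Steps: 100%|#| 699/700 [05:03<00:00, 2.31it/s, loss=0.194, lr=0.0001] "
    "Steps: 100%|#| 699/700 [05:03<00:00, 2.31it/s, loss=0.229, lr=0.0001] "
    "Steps: 100%|#| 700/700 [05:03<00:00, 2.31it/s, loss=0.229, lr=0.0001] "
    "Steps: 100%|#| 700/700 [05:03<00:00, 2.31it/s, loss=0.141, lr=0.0001]"
)

if __name__ == "__main__":
    parsed = losses_by_step(sample)
    print(parsed)
    print("mean loss over parsed steps:", mean(parsed.values()))
```

Per-step losses in this log swing widely from one step to the next, so an average over a window of steps is likely a more useful signal than any single value.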
Prediction
lucataco/stable-diffusion-3.5-large-lora-trainer:c60e90e9737b8c6b9775e2bc3c167d62da8a7bd7000c9244572b1b5193f3c27a
ID: 2vm6qf6cfxrj20cjrxcavwtwb0
Status: Succeeded
Source: Web
Hardware: A100 (80GB)
Total duration
Created by @lucataco
Input
- rank: 16
- backend: no
- hf_token: ████████████████████ (This value was redacted after being sent to the model.)
- optimizer: AdamW
- resolution: 768
- hub_model_id: lucataco/SD3.5-Large-queso
- input_images: queso2.zip
- lr_scheduler: constant
- learning_rate: 0.0001
- instance_prompt: a photo of QSO dog
- max_train_steps: 700
- train_batch_size: 1
- gradient_accumulation_steps: 1
{ "rank": 16, "backend": "no", "hf_token": "[REDACTED]", "optimizer": "AdamW", "resolution": 768, "hub_model_id": "lucataco/SD3.5-Large-queso", "input_images": "https://replicate.delivery/pbxt/LlvEhC2BWWxpMKYYzCHHf2TSdJdZ15on8LEkbLY6evEC9bVd/queso2.zip", "lr_scheduler": "constant", "learning_rate": 0.0001, "instance_prompt": "a photo of QSO dog", "max_train_steps": 700, "train_batch_size": 1, "gradient_accumulation_steps": 1 }
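The run is bounded by `max_train_steps` rather than by an epoch count, so the number of epochs follows from the dataset size. The logs for this prediction report 12 examples and 59 epochs. The sketch below shows one way those numbers relate. The formula is an assumption about how the training script derives the epoch count; it is consistent with the logged values but is not taken from the script itself.

```python
import math


def num_epochs(
    num_examples: int,
    train_batch_size: int,
    gradient_accumulation_steps: int,
    max_train_steps: int,
) -> int:
    """Estimate how many passes over the data are needed to reach max_train_steps."""
    # Batches the dataloader yields per pass over the data.
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    # Optimizer updates per pass (several batches may be accumulated per update).
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    # The final epoch may be partial, hence the ceiling.
    return math.ceil(max_train_steps / updates_per_epoch)


if __name__ == "__main__":
    # Values from this prediction: 12 images, batch size 1, no accumulation, 700 steps.
    print(num_epochs(12, 1, 1, 700))  # 59, matching "Num Epochs = 59" in the logs
```

With these settings every image is seen roughly 58 times, which is worth keeping in mind when changing `max_train_steps` or the number of input images.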
Install Replicate’s Node.js client library:
npm install replicate
Import and set up the client:
import Replicate from "replicate";
const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
import { writeFile } from "node:fs/promises";

const output = await replicate.run(
  "lucataco/stable-diffusion-3.5-large-lora-trainer:c60e90e9737b8c6b9775e2bc3c167d62da8a7bd7000c9244572b1b5193f3c27a",
  {
    input: {
      rank: 16,
      backend: "no",
      hf_token: "[REDACTED]",
      optimizer: "AdamW",
      resolution: 768,
      hub_model_id: "lucataco/SD3.5-Large-queso",
      input_images: "https://replicate.delivery/pbxt/LlvEhC2BWWxpMKYYzCHHf2TSdJdZ15on8LEkbLY6evEC9bVd/queso2.zip",
      lr_scheduler: "constant",
      learning_rate: 0.0001,
      instance_prompt: "a photo of QSO dog",
      max_train_steps: 700,
      train_batch_size: 1,
      gradient_accumulation_steps: 1
    }
  }
);

// To access the file URL:
console.log(output.url()); //=> "http://example.com"

// To write the file to disk (the promise-based writeFile needs no callback):
await writeFile("my-image.png", output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:
pip install replicate
Import the client:
import replicate
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "lucataco/stable-diffusion-3.5-large-lora-trainer:c60e90e9737b8c6b9775e2bc3c167d62da8a7bd7000c9244572b1b5193f3c27a",
    input={
        "rank": 16,
        "backend": "no",
        "hf_token": "[REDACTED]",
        "optimizer": "AdamW",
        "resolution": 768,
        "hub_model_id": "lucataco/SD3.5-Large-queso",
        "input_images": "https://replicate.delivery/pbxt/LlvEhC2BWWxpMKYYzCHHf2TSdJdZ15on8LEkbLY6evEC9bVd/queso2.zip",
        "lr_scheduler": "constant",
        "learning_rate": 0.0001,
        "instance_prompt": "a photo of QSO dog",
        "max_train_steps": 700,
        "train_batch_size": 1,
        "gradient_accumulation_steps": 1
    }
)
print(output)
To learn more, take a look at the guide on getting started with Python.
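The Python call above passes `hf_token` as a literal string. As a hedged sketch, the input could instead be assembled by a small helper that reads the token from the environment, so it never appears in source code. The helper `build_training_input` and the environment variable name `HF_TOKEN` are assumptions made for illustration; the input keys and default values are the ones shown for this prediction.

```python
import os
from typing import Mapping, Optional

MODEL_VERSION = (
    "lucataco/stable-diffusion-3.5-large-lora-trainer:"
    "c60e90e9737b8c6b9775e2bc3c167d62da8a7bd7000c9244572b1b5193f3c27a"
)


def build_training_input(
    images_url: str,
    instance_prompt: str,
    hub_model_id: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Build the trainer input dict, using this prediction's settings as defaults."""
    training_input = {
        "rank": 16,
        "backend": "no",
        "optimizer": "AdamW",
        "resolution": 768,
        "input_images": images_url,
        "lr_scheduler": "constant",
        "learning_rate": 0.0001,
        "instance_prompt": instance_prompt,
        "max_train_steps": 700,
        "train_batch_size": 1,
        "gradient_accumulation_steps": 1,
    }
    if hub_model_id is not None:
        # HF_TOKEN is a hypothetical variable name chosen for this sketch.
        token = env.get("HF_TOKEN")
        if not token:
            raise RuntimeError("Set HF_TOKEN before pushing to the Hugging Face Hub")
        training_input["hub_model_id"] = hub_model_id
        training_input["hf_token"] = token
    return training_input
```

The returned dict can then be passed as `input=` to `replicate.run(MODEL_VERSION, input=...)`. When `hub_model_id` is omitted, no token is needed and the two Hub-related keys are left out, as in the first prediction on this page.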
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "c60e90e9737b8c6b9775e2bc3c167d62da8a7bd7000c9244572b1b5193f3c27a",
    "input": {
      "rank": 16,
      "backend": "no",
      "hf_token": "[REDACTED]",
      "optimizer": "AdamW",
      "resolution": 768,
      "hub_model_id": "lucataco/SD3.5-Large-queso",
      "input_images": "https://replicate.delivery/pbxt/LlvEhC2BWWxpMKYYzCHHf2TSdJdZ15on8LEkbLY6evEC9bVd/queso2.zip",
      "lr_scheduler": "constant",
      "learning_rate": 0.0001,
      "instance_prompt": "a photo of QSO dog",
      "max_train_steps": 700,
      "train_batch_size": 1,
      "gradient_accumulation_steps": 1
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
Output
{ "completed_at": "2024-10-26T05:49:33.854994Z", "created_at": "2024-10-26T05:42:18.495000Z", "data_removed": false, "error": null, "id": "2vm6qf6cfxrj20cjrxcavwtwb0", "input": { "rank": 16, "backend": "no", "hf_token": "[REDACTED]", "optimizer": "AdamW", "resolution": 768, "hub_model_id": "lucataco/SD3.5-Large-queso", "input_images": "https://replicate.delivery/pbxt/LlvEhC2BWWxpMKYYzCHHf2TSdJdZ15on8LEkbLY6evEC9bVd/queso2.zip", "lr_scheduler": "constant", "learning_rate": 0.0001, "instance_prompt": "a photo of QSO dog", "max_train_steps": 700, "train_batch_size": 1, "gradient_accumulation_steps": 1 }, "logs": "Using seed: 3962600180\nExtracted 12 files from zip to input_images\nThe token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.\nToken is valid (permission: write).\nYour token has been saved to /root/.cache/huggingface/token\nLogin successful\nUsing params: ['accelerate', 'launch', '--dynamo_backend', 'no', 'train_dreambooth_lora_sd3.py', '--pretrained_model_name_or_path', 'stable-diffusion-3.5-large', '--instance_data_dir', 'input_images', '--rank', '16', '--output_dir', '/tmp/train/output/sd35_large_train_replicate', '--mixed_precision', 'bf16', '--instance_prompt', 'a photo of QSO dog', '--resolution', '768', '--train_batch_size', '1', '--gradient_accumulation_steps', '1', '--optimizer', 'AdamW', '--learning_rate', '0.0001', '--lr_scheduler', 'constant', '--lr_warmup_steps', '0', '--max_train_steps', '700', '--checkpointing_steps', '701', '--seed', '3962600180', '--logging_dir', '/tmp/logs', '--push_to_hub', '--hub_token', '[REDACTED]', '--hub_model_id', 'lucataco/SD3.5-Large-queso']\n10/26/2024 05:43:25 - INFO - __main__ - Distributed environment: DistributedType.NO\nNum processes: 1\nProcess index: 0\nLocal process index: 0\nDevice: cuda\nMixed precision type: 
bf16\nYou set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers\nYou are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.\nYou are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.\nYou are using a model of type t5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.\n{'use_dynamic_shifting', 'max_shift', 'base_shift', 'base_image_seq_len', 'max_image_seq_len'} was not found in config. Values will be initialized to default values.\nLoading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]\nLoading checkpoint shards: 50%|█████ | 1/2 [00:03<00:03, 3.70s/it]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.55s/it]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.57s/it]\n{'dual_attention_layers'} was not found in config. Values will be initialized to default values.\n10/26/2024 05:44:12 - INFO - __main__ - ***** Running training *****\n10/26/2024 05:44:12 - INFO - __main__ - Num examples = 12\n10/26/2024 05:44:12 - INFO - __main__ - Num batches each epoch = 12\n10/26/2024 05:44:12 - INFO - __main__ - Num Epochs = 59\n10/26/2024 05:44:12 - INFO - __main__ - Instantaneous batch size per device = 1\n10/26/2024 05:44:12 - INFO - __main__ - Total train batch size (w. 
parallel, distributed & accumulation) = 1\n10/26/2024 05:44:12 - INFO - __main__ - Gradient Accumulation steps = 1\n10/26/2024 05:44:12 - INFO - __main__ - Total optimization steps = 700\nSteps: 0%| | 0/700 [00:00<?, ?it/s]\nSteps: 0%| | 1/700 [00:00<07:39, 1.52it/s]\nSteps: 0%| | 1/700 [00:00<07:39, 1.52it/s, loss=0.139, lr=0.0001]\nSteps: 0%| | 2/700 [00:01<05:55, 1.96it/s, loss=0.139, lr=0.0001]\nSteps: 0%| | 2/700 [00:01<05:55, 1.96it/s, loss=0.0678, lr=0.0001]\nSteps: 0%| | 3/700 [00:01<05:30, 2.11it/s, loss=0.0678, lr=0.0001]\nSteps: 0%| | 3/700 [00:01<05:30, 2.11it/s, loss=0.04, lr=0.0001] \nSteps: 1%| | 4/700 [00:01<05:18, 2.19it/s, loss=0.04, lr=0.0001]\nSteps: 1%| | 4/700 [00:01<05:18, 2.19it/s, loss=0.0636, lr=0.0001]\nSteps: 1%| | 5/700 [00:02<05:11, 2.23it/s, loss=0.0636, lr=0.0001]\nSteps: 1%| | 5/700 [00:02<05:11, 2.23it/s, loss=0.0621, lr=0.0001]\nSteps: 1%| | 6/700 [00:02<05:07, 2.26it/s, loss=0.0621, lr=0.0001]\nSteps: 1%| | 6/700 [00:02<05:07, 2.26it/s, loss=0.152, lr=0.0001] \nSteps: 1%| | 7/700 [00:03<05:04, 2.27it/s, loss=0.152, lr=0.0001]\nSteps: 1%| | 7/700 [00:03<05:04, 2.27it/s, loss=0.204, lr=0.0001]\nSteps: 1%| | 8/700 [00:03<05:02, 2.29it/s, loss=0.204, lr=0.0001]\nSteps: 1%| | 8/700 [00:03<05:02, 2.29it/s, loss=0.105, lr=0.0001]\nSteps: 1%|▏ | 9/700 [00:04<05:01, 2.29it/s, loss=0.105, lr=0.0001]\nSteps: 1%|▏ | 9/700 [00:04<05:01, 2.29it/s, loss=0.111, lr=0.0001]\nSteps: 1%|▏ | 10/700 [00:04<05:00, 2.30it/s, loss=0.111, lr=0.0001]\nSteps: 1%|▏ | 10/700 [00:04<05:00, 2.30it/s, loss=0.111, lr=0.0001]\nSteps: 2%|▏ | 11/700 [00:04<04:59, 2.30it/s, loss=0.111, lr=0.0001]\nSteps: 2%|▏ | 11/700 [00:04<04:59, 2.30it/s, loss=0.161, lr=0.0001]\nSteps: 2%|▏ | 12/700 [00:05<04:58, 2.31it/s, loss=0.161, lr=0.0001]\nSteps: 2%|▏ | 12/700 [00:05<04:58, 2.31it/s, loss=0.0618, lr=0.0001]\nSteps: 2%|▏ | 13/700 [00:05<04:58, 2.30it/s, loss=0.0618, lr=0.0001]\nSteps: 2%|▏ | 13/700 [00:05<04:58, 2.30it/s, loss=0.0331, lr=0.0001]\nSteps: 2%|▏ | 14/700 
[00:06<04:57, 2.30it/s, loss=0.0331, lr=0.0001]\nSteps: 2%|▏ | 14/700 [00:06<04:57, 2.30it/s, loss=0.0863, lr=0.0001]\nSteps: 2%|▏ | 15/700 [00:06<04:56, 2.31it/s, loss=0.0863, lr=0.0001]\nSteps: 2%|▏ | 15/700 [00:06<04:56, 2.31it/s, loss=0.133, lr=0.0001] \nSteps: 2%|▏ | 16/700 [00:07<04:56, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 2%|▏ | 16/700 [00:07<04:56, 2.31it/s, loss=0.228, lr=0.0001]\nSteps: 2%|▏ | 17/700 [00:07<04:55, 2.31it/s, loss=0.228, lr=0.0001]\nSteps: 2%|▏ | 17/700 [00:07<04:55, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 3%|▎ | 18/700 [00:07<04:55, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 3%|▎ | 18/700 [00:08<04:55, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 3%|▎ | 19/700 [00:08<04:55, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 3%|▎ | 19/700 [00:08<04:55, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 3%|▎ | 20/700 [00:08<04:54, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 3%|▎ | 20/700 [00:08<04:54, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 3%|▎ | 21/700 [00:09<04:54, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 3%|▎ | 21/700 [00:09<04:54, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 3%|▎ | 22/700 [00:09<04:53, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 3%|▎ | 22/700 [00:09<04:53, 2.31it/s, loss=0.1, lr=0.0001] \nSteps: 3%|▎ | 23/700 [00:10<04:53, 2.31it/s, loss=0.1, lr=0.0001]\nSteps: 3%|▎ | 23/700 [00:10<04:53, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 3%|▎ | 24/700 [00:10<04:52, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 3%|▎ | 24/700 [00:10<04:52, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 4%|▎ | 25/700 [00:11<04:53, 2.30it/s, loss=0.105, lr=0.0001]\nSteps: 4%|▎ | 25/700 [00:11<04:53, 2.30it/s, loss=0.0693, lr=0.0001]\nSteps: 4%|▎ | 26/700 [00:11<04:52, 2.30it/s, loss=0.0693, lr=0.0001]\nSteps: 4%|▎ | 26/700 [00:11<04:52, 2.30it/s, loss=0.152, lr=0.0001] \nSteps: 4%|▍ | 27/700 [00:11<04:51, 2.31it/s, loss=0.152, lr=0.0001]\nSteps: 4%|▍ | 27/700 [00:11<04:51, 2.31it/s, loss=0.0932, lr=0.0001]\nSteps: 4%|▍ | 28/700 [00:12<04:51, 2.31it/s, loss=0.0932, lr=0.0001]\nSteps: 4%|▍ | 
28/700 [00:12<04:51, 2.31it/s, loss=0.193, lr=0.0001] \nSteps: 4%|▍ | 29/700 [00:12<04:50, 2.31it/s, loss=0.193, lr=0.0001]\nSteps: 4%|▍ | 29/700 [00:12<04:50, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 4%|▍ | 30/700 [00:13<04:49, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 4%|▍ | 30/700 [00:13<04:49, 2.31it/s, loss=0.0597, lr=0.0001]\nSteps: 4%|▍ | 31/700 [00:13<04:49, 2.31it/s, loss=0.0597, lr=0.0001]\nSteps: 4%|▍ | 31/700 [00:13<04:49, 2.31it/s, loss=0.0413, lr=0.0001]\nSteps: 5%|▍ | 32/700 [00:14<04:48, 2.31it/s, loss=0.0413, lr=0.0001]\nSteps: 5%|▍ | 32/700 [00:14<04:48, 2.31it/s, loss=0.0705, lr=0.0001]\nSteps: 5%|▍ | 33/700 [00:14<04:48, 2.31it/s, loss=0.0705, lr=0.0001]\nSteps: 5%|▍ | 33/700 [00:14<04:48, 2.31it/s, loss=0.0996, lr=0.0001]\nSteps: 5%|▍ | 34/700 [00:14<04:48, 2.31it/s, loss=0.0996, lr=0.0001]\nSteps: 5%|▍ | 34/700 [00:14<04:48, 2.31it/s, loss=0.0392, lr=0.0001]\nSteps: 5%|▌ | 35/700 [00:15<04:47, 2.31it/s, loss=0.0392, lr=0.0001]\nSteps: 5%|▌ | 35/700 [00:15<04:47, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 5%|▌ | 36/700 [00:15<04:46, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 5%|▌ | 36/700 [00:15<04:46, 2.31it/s, loss=0.127, lr=0.0001] \nSteps: 5%|▌ | 37/700 [00:16<04:47, 2.30it/s, loss=0.127, lr=0.0001]\nSteps: 5%|▌ | 37/700 [00:16<04:47, 2.30it/s, loss=0.195, lr=0.0001]\nSteps: 5%|▌ | 38/700 [00:16<04:46, 2.31it/s, loss=0.195, lr=0.0001]\nSteps: 5%|▌ | 38/700 [00:16<04:46, 2.31it/s, loss=0.0707, lr=0.0001]\nSteps: 6%|▌ | 39/700 [00:17<04:46, 2.31it/s, loss=0.0707, lr=0.0001]\nSteps: 6%|▌ | 39/700 [00:17<04:46, 2.31it/s, loss=0.0302, lr=0.0001]\nSteps: 6%|▌ | 40/700 [00:17<04:45, 2.31it/s, loss=0.0302, lr=0.0001]\nSteps: 6%|▌ | 40/700 [00:17<04:45, 2.31it/s, loss=0.0603, lr=0.0001]\nSteps: 6%|▌ | 41/700 [00:17<04:45, 2.31it/s, loss=0.0603, lr=0.0001]\nSteps: 6%|▌ | 41/700 [00:17<04:45, 2.31it/s, loss=0.119, lr=0.0001] \nSteps: 6%|▌ | 42/700 [00:18<04:44, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 6%|▌ | 42/700 [00:18<04:44, 2.31it/s, loss=0.101, 
lr=0.0001]\nSteps: 6%|▌ | 43/700 [00:18<04:44, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 6%|▌ | 43/700 [00:18<04:44, 2.31it/s, loss=0.0303, lr=0.0001]\nSteps: 6%|▋ | 44/700 [00:19<04:43, 2.31it/s, loss=0.0303, lr=0.0001]\nSteps: 6%|▋ | 44/700 [00:19<04:43, 2.31it/s, loss=0.152, lr=0.0001] \nSteps: 6%|▋ | 45/700 [00:19<04:42, 2.32it/s, loss=0.152, lr=0.0001]\nSteps: 6%|▋ | 45/700 [00:19<04:42, 2.32it/s, loss=0.0641, lr=0.0001]\nSteps: 7%|▋ | 46/700 [00:20<04:42, 2.31it/s, loss=0.0641, lr=0.0001]\nSteps: 7%|▋ | 46/700 [00:20<04:42, 2.31it/s, loss=0.0736, lr=0.0001]\nSteps: 7%|▋ | 47/700 [00:20<04:42, 2.31it/s, loss=0.0736, lr=0.0001]\nSteps: 7%|▋ | 47/700 [00:20<04:42, 2.31it/s, loss=0.0928, lr=0.0001]\nSteps: 7%|▋ | 48/700 [00:20<04:41, 2.32it/s, loss=0.0928, lr=0.0001]\nSteps: 7%|▋ | 48/700 [00:20<04:41, 2.32it/s, loss=0.115, lr=0.0001] \nSteps: 7%|▋ | 49/700 [00:21<04:42, 2.30it/s, loss=0.115, lr=0.0001]\nSteps: 7%|▋ | 49/700 [00:21<04:42, 2.30it/s, loss=0.105, lr=0.0001]\nSteps: 7%|▋ | 50/700 [00:21<04:41, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 7%|▋ | 50/700 [00:21<04:41, 2.31it/s, loss=0.0713, lr=0.0001]\nSteps: 7%|▋ | 51/700 [00:22<04:41, 2.31it/s, loss=0.0713, lr=0.0001]\nSteps: 7%|▋ | 51/700 [00:22<04:41, 2.31it/s, loss=0.0728, lr=0.0001]\nSteps: 7%|▋ | 52/700 [00:22<04:40, 2.31it/s, loss=0.0728, lr=0.0001]\nSteps: 7%|▋ | 52/700 [00:22<04:40, 2.31it/s, loss=0.0927, lr=0.0001]\nSteps: 8%|▊ | 53/700 [00:23<04:39, 2.31it/s, loss=0.0927, lr=0.0001]\nSteps: 8%|▊ | 53/700 [00:23<04:39, 2.31it/s, loss=0.119, lr=0.0001] \nSteps: 8%|▊ | 54/700 [00:23<04:39, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 8%|▊ | 54/700 [00:23<04:39, 2.31it/s, loss=0.0595, lr=0.0001]\nSteps: 8%|▊ | 55/700 [00:24<04:38, 2.31it/s, loss=0.0595, lr=0.0001]\nSteps: 8%|▊ | 55/700 [00:24<04:38, 2.31it/s, loss=0.168, lr=0.0001] \nSteps: 8%|▊ | 56/700 [00:24<04:38, 2.31it/s, loss=0.168, lr=0.0001]\nSteps: 8%|▊ | 56/700 [00:24<04:38, 2.31it/s, loss=0.114, lr=0.0001]\nSteps: 8%|▊ | 57/700 [00:24<04:37, 
2.31it/s, loss=0.114, lr=0.0001]\nSteps: 8%|▊ | 57/700 [00:24<04:37, 2.31it/s, loss=0.191, lr=0.0001]\nSteps: 8%|▊ | 58/700 [00:25<04:37, 2.31it/s, loss=0.191, lr=0.0001]\nSteps: 8%|▊ | 58/700 [00:25<04:37, 2.31it/s, loss=0.143, lr=0.0001]\nSteps: 8%|▊ | 59/700 [00:25<04:37, 2.31it/s, loss=0.143, lr=0.0001]\nSteps: 8%|▊ | 59/700 [00:25<04:37, 2.31it/s, loss=0.068, lr=0.0001]\nSteps: 9%|▊ | 60/700 [00:26<04:36, 2.32it/s, loss=0.068, lr=0.0001]\nSteps: 9%|▊ | 60/700 [00:26<04:36, 2.32it/s, loss=0.0855, lr=0.0001]\nSteps: 9%|▊ | 61/700 [00:26<04:37, 2.30it/s, loss=0.0855, lr=0.0001]\nSteps: 9%|▊ | 61/700 [00:26<04:37, 2.30it/s, loss=0.0649, lr=0.0001]\nSteps: 9%|▉ | 62/700 [00:27<04:36, 2.31it/s, loss=0.0649, lr=0.0001]\nSteps: 9%|▉ | 62/700 [00:27<04:36, 2.31it/s, loss=0.0905, lr=0.0001]\nSteps: 9%|▉ | 63/700 [00:27<04:35, 2.31it/s, loss=0.0905, lr=0.0001]\nSteps: 9%|▉ | 63/700 [00:27<04:35, 2.31it/s, loss=0.0868, lr=0.0001]\nSteps: 9%|▉ | 64/700 [00:27<04:35, 2.31it/s, loss=0.0868, lr=0.0001]\nSteps: 9%|▉ | 64/700 [00:27<04:35, 2.31it/s, loss=0.0788, lr=0.0001]\nSteps: 9%|▉ | 65/700 [00:28<04:34, 2.31it/s, loss=0.0788, lr=0.0001]\nSteps: 9%|▉ | 65/700 [00:28<04:34, 2.31it/s, loss=0.132, lr=0.0001] \nSteps: 9%|▉ | 66/700 [00:28<04:34, 2.31it/s, loss=0.132, lr=0.0001]\nSteps: 9%|▉ | 66/700 [00:28<04:34, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 10%|▉ | 67/700 [00:29<04:33, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 10%|▉ | 67/700 [00:29<04:33, 2.31it/s, loss=0.0693, lr=0.0001]\nSteps: 10%|▉ | 68/700 [00:29<04:33, 2.31it/s, loss=0.0693, lr=0.0001]\nSteps: 10%|▉ | 68/700 [00:29<04:33, 2.31it/s, loss=0.111, lr=0.0001] \nSteps: 10%|▉ | 69/700 [00:30<04:32, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 10%|▉ | 69/700 [00:30<04:32, 2.31it/s, loss=0.0441, lr=0.0001]\nSteps: 10%|█ | 70/700 [00:30<04:32, 2.31it/s, loss=0.0441, lr=0.0001]\nSteps: 10%|█ | 70/700 [00:30<04:32, 2.31it/s, loss=0.112, lr=0.0001] \nSteps: 10%|█ | 71/700 [00:30<04:31, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 
10%|█ | 71/700 [00:30<04:31, 2.31it/s, loss=0.1, lr=0.0001] \nSteps: 10%|█ | 72/700 [00:31<04:31, 2.32it/s, loss=0.1, lr=0.0001]\nSteps: 10%|█ | 72/700 [00:31<04:31, 2.32it/s, loss=0.3, lr=0.0001]\nSteps: 10%|█ | 73/700 [00:31<04:32, 2.30it/s, loss=0.3, lr=0.0001]\nSteps: 10%|█ | 73/700 [00:31<04:32, 2.30it/s, loss=0.132, lr=0.0001]\nSteps: 11%|█ | 74/700 [00:32<04:31, 2.31it/s, loss=0.132, lr=0.0001]\nSteps: 11%|█ | 74/700 [00:32<04:31, 2.31it/s, loss=0.0758, lr=0.0001]\nSteps: 11%|█ | 75/700 [00:32<04:30, 2.31it/s, loss=0.0758, lr=0.0001]\nSteps: 11%|█ | 75/700 [00:32<04:30, 2.31it/s, loss=0.107, lr=0.0001] \nSteps: 11%|█ | 76/700 [00:33<04:30, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 11%|█ | 76/700 [00:33<04:30, 2.31it/s, loss=0.0793, lr=0.0001]\nSteps: 11%|█ | 77/700 [00:33<04:29, 2.31it/s, loss=0.0793, lr=0.0001]\nSteps: 11%|█ | 77/700 [00:33<04:29, 2.31it/s, loss=0.0566, lr=0.0001]\nSteps: 11%|█ | 78/700 [00:33<04:28, 2.31it/s, loss=0.0566, lr=0.0001]\nSteps: 11%|█ | 78/700 [00:33<04:28, 2.31it/s, loss=0.187, lr=0.0001] \nSteps: 11%|█▏ | 79/700 [00:34<04:28, 2.31it/s, loss=0.187, lr=0.0001]\nSteps: 11%|█▏ | 79/700 [00:34<04:28, 2.31it/s, loss=0.138, lr=0.0001]\nSteps: 11%|█▏ | 80/700 [00:34<04:28, 2.31it/s, loss=0.138, lr=0.0001]\nSteps: 11%|█▏ | 80/700 [00:34<04:28, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 12%|█▏ | 81/700 [00:35<04:27, 2.31it/s, loss=0.141, lr=0.0001]\nSteps: 12%|█▏ | 81/700 [00:35<04:27, 2.31it/s, loss=0.0718, lr=0.0001]\nSteps: 12%|█▏ | 82/700 [00:35<04:26, 2.32it/s, loss=0.0718, lr=0.0001]\nSteps: 12%|█▏ | 82/700 [00:35<04:26, 2.32it/s, loss=0.134, lr=0.0001] \nSteps: 12%|█▏ | 83/700 [00:36<04:26, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 12%|█▏ | 83/700 [00:36<04:26, 2.31it/s, loss=0.19, lr=0.0001] \nSteps: 12%|█▏ | 84/700 [00:36<04:26, 2.32it/s, loss=0.19, lr=0.0001]\nSteps: 12%|█▏ | 84/700 [00:36<04:26, 2.32it/s, loss=0.157, lr=0.0001]\nSteps: 12%|█▏ | 85/700 [00:36<04:26, 2.30it/s, loss=0.157, lr=0.0001]\nSteps: 12%|█▏ | 85/700 
[00:37<04:26, 2.30it/s, loss=0.0392, lr=0.0001]\nSteps: 12%|█▏ | 86/700 [00:37<04:26, 2.31it/s, loss=0.0392, lr=0.0001]\nSteps: 12%|█▏ | 86/700 [00:37<04:26, 2.31it/s, loss=0.223, lr=0.0001] \nSteps: 12%|█▏ | 87/700 [00:37<04:25, 2.31it/s, loss=0.223, lr=0.0001]\nSteps: 12%|█▏ | 87/700 [00:37<04:25, 2.31it/s, loss=0.0923, lr=0.0001]\nSteps: 13%|█▎ | 88/700 [00:38<04:24, 2.31it/s, loss=0.0923, lr=0.0001]\nSteps: 13%|█▎ | 88/700 [00:38<04:24, 2.31it/s, loss=0.0809, lr=0.0001]\nSteps: 13%|█▎ | 89/700 [00:38<04:24, 2.31it/s, loss=0.0809, lr=0.0001]\nSteps: 13%|█▎ | 89/700 [00:38<04:24, 2.31it/s, loss=0.0959, lr=0.0001]\nSteps: 13%|█▎ | 90/700 [00:39<04:23, 2.31it/s, loss=0.0959, lr=0.0001]\nSteps: 13%|█▎ | 90/700 [00:39<04:23, 2.31it/s, loss=0.0515, lr=0.0001]\nSteps: 13%|█▎ | 91/700 [00:39<04:23, 2.31it/s, loss=0.0515, lr=0.0001]\nSteps: 13%|█▎ | 91/700 [00:39<04:23, 2.31it/s, loss=0.0861, lr=0.0001]\nSteps: 13%|█▎ | 92/700 [00:40<04:22, 2.31it/s, loss=0.0861, lr=0.0001]\nSteps: 13%|█▎ | 92/700 [00:40<04:22, 2.31it/s, loss=0.0618, lr=0.0001]\nSteps: 13%|█▎ | 93/700 [00:40<04:22, 2.31it/s, loss=0.0618, lr=0.0001]\nSteps: 13%|█▎ | 93/700 [00:40<04:22, 2.31it/s, loss=0.0733, lr=0.0001]\nSteps: 13%|█▎ | 94/700 [00:40<04:21, 2.31it/s, loss=0.0733, lr=0.0001]\nSteps: 13%|█▎ | 94/700 [00:40<04:21, 2.31it/s, loss=0.164, lr=0.0001] \nSteps: 14%|█▎ | 95/700 [00:41<04:21, 2.32it/s, loss=0.164, lr=0.0001]\nSteps: 14%|█▎ | 95/700 [00:41<04:21, 2.32it/s, loss=0.123, lr=0.0001]\nSteps: 14%|█▎ | 96/700 [00:41<04:20, 2.32it/s, loss=0.123, lr=0.0001]\nSteps: 14%|█▎ | 96/700 [00:41<04:20, 2.32it/s, loss=0.185, lr=0.0001]\nSteps: 14%|█▍ | 97/700 [00:42<04:21, 2.30it/s, loss=0.185, lr=0.0001]\nSteps: 14%|█▍ | 97/700 [00:42<04:21, 2.30it/s, loss=0.0795, lr=0.0001]\nSteps: 14%|█▍ | 98/700 [00:42<04:20, 2.31it/s, loss=0.0795, lr=0.0001]\nSteps: 14%|█▍ | 98/700 [00:42<04:20, 2.31it/s, loss=0.124, lr=0.0001] \nSteps: 14%|█▍ | 99/700 [00:43<04:20, 2.31it/s, loss=0.124, lr=0.0001]\nSteps: 14%|█▍ 
| 99/700 [00:43<04:20, 2.31it/s, loss=0.157, lr=0.0001]\nSteps: 14%|█▍ | 100/700 [00:43<04:19, 2.31it/s, loss=0.157, lr=0.0001]\nSteps: 14%|█▍ | 100/700 [00:43<04:19, 2.31it/s, loss=0.0614, lr=0.0001]\nSteps: 14%|█▍ | 101/700 [00:43<04:19, 2.31it/s, loss=0.0614, lr=0.0001]\nSteps: 14%|█▍ | 101/700 [00:43<04:19, 2.31it/s, loss=0.0955, lr=0.0001]\nSteps: 15%|█▍ | 102/700 [00:44<04:18, 2.31it/s, loss=0.0955, lr=0.0001]\nSteps: 15%|█▍ | 102/700 [00:44<04:18, 2.31it/s, loss=0.0545, lr=0.0001]\nSteps: 15%|█▍ | 103/700 [00:44<04:18, 2.31it/s, loss=0.0545, lr=0.0001]\nSteps: 15%|█▍ | 103/700 [00:44<04:18, 2.31it/s, loss=0.168, lr=0.0001] \nSteps: 15%|█▍ | 104/700 [00:45<04:17, 2.32it/s, loss=0.168, lr=0.0001]\nSteps: 15%|█▍ | 104/700 [00:45<04:17, 2.32it/s, loss=0.0944, lr=0.0001]\nSteps: 15%|█▌ | 105/700 [00:45<04:16, 2.32it/s, loss=0.0944, lr=0.0001]\nSteps: 15%|█▌ | 105/700 [00:45<04:16, 2.32it/s, loss=0.0917, lr=0.0001]\nSteps: 15%|█▌ | 106/700 [00:46<04:16, 2.31it/s, loss=0.0917, lr=0.0001]\nSteps: 15%|█▌ | 106/700 [00:46<04:16, 2.31it/s, loss=0.0696, lr=0.0001]\nSteps: 15%|█▌ | 107/700 [00:46<04:16, 2.31it/s, loss=0.0696, lr=0.0001]\nSteps: 15%|█▌ | 107/700 [00:46<04:16, 2.31it/s, loss=0.15, lr=0.0001] \nSteps: 15%|█▌ | 108/700 [00:46<04:15, 2.32it/s, loss=0.15, lr=0.0001]\nSteps: 15%|█▌ | 108/700 [00:46<04:15, 2.32it/s, loss=0.0707, lr=0.0001]\nSteps: 16%|█▌ | 109/700 [00:47<04:16, 2.30it/s, loss=0.0707, lr=0.0001]\nSteps: 16%|█▌ | 109/700 [00:47<04:16, 2.30it/s, loss=0.281, lr=0.0001] \nSteps: 16%|█▌ | 110/700 [00:47<04:15, 2.31it/s, loss=0.281, lr=0.0001]\nSteps: 16%|█▌ | 110/700 [00:47<04:15, 2.31it/s, loss=0.0787, lr=0.0001]\nSteps: 16%|█▌ | 111/700 [00:48<04:14, 2.31it/s, loss=0.0787, lr=0.0001]\nSteps: 16%|█▌ | 111/700 [00:48<04:14, 2.31it/s, loss=0.139, lr=0.0001] \nSteps: 16%|█▌ | 112/700 [00:48<04:14, 2.31it/s, loss=0.139, lr=0.0001]\nSteps: 16%|█▌ | 112/700 [00:48<04:14, 2.31it/s, loss=0.15, lr=0.0001] \nSteps: 16%|█▌ | 113/700 [00:49<04:13, 2.31it/s, 
loss=0.15, lr=0.0001]\nSteps: 16%|█▌ | 113/700 [00:49<04:13, 2.31it/s, loss=0.0713, lr=0.0001]\nSteps: 16%|█▋ | 114/700 [00:49<04:13, 2.31it/s, loss=0.0713, lr=0.0001]\nSteps: 16%|█▋ | 114/700 [00:49<04:13, 2.31it/s, loss=0.0331, lr=0.0001]\nSteps: 16%|█▋ | 115/700 [00:49<04:12, 2.31it/s, loss=0.0331, lr=0.0001]\nSteps: 16%|█▋ | 115/700 [00:49<04:12, 2.31it/s, loss=0.0542, lr=0.0001]\nSteps: 17%|█▋ | 116/700 [00:50<04:12, 2.31it/s, loss=0.0542, lr=0.0001]\nSteps: 17%|█▋ | 116/700 [00:50<04:12, 2.31it/s, loss=0.082, lr=0.0001] \nSteps: 17%|█▋ | 117/700 [00:50<04:12, 2.31it/s, loss=0.082, lr=0.0001]\nSteps: 17%|█▋ | 117/700 [00:50<04:12, 2.31it/s, loss=0.215, lr=0.0001]\nSteps: 17%|█▋ | 118/700 [00:51<04:11, 2.31it/s, loss=0.215, lr=0.0001]\nSteps: 17%|█▋ | 118/700 [00:51<04:11, 2.31it/s, loss=0.0356, lr=0.0001]\nSteps: 17%|█▋ | 119/700 [00:51<04:11, 2.31it/s, loss=0.0356, lr=0.0001]\nSteps: 17%|█▋ | 119/700 [00:51<04:11, 2.31it/s, loss=0.156, lr=0.0001] \nSteps: 17%|█▋ | 120/700 [00:52<04:10, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 17%|█▋ | 120/700 [00:52<04:10, 2.31it/s, loss=0.379, lr=0.0001]\nSteps: 17%|█▋ | 121/700 [00:52<04:11, 2.30it/s, loss=0.379, lr=0.0001]\nSteps: 17%|█▋ | 121/700 [00:52<04:11, 2.30it/s, loss=0.123, lr=0.0001]\nSteps: 17%|█▋ | 122/700 [00:52<04:10, 2.31it/s, loss=0.123, lr=0.0001]\nSteps: 17%|█▋ | 122/700 [00:53<04:10, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 18%|█▊ | 123/700 [00:53<04:09, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 18%|█▊ | 123/700 [00:53<04:09, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 18%|█▊ | 124/700 [00:53<04:09, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 18%|█▊ | 124/700 [00:53<04:09, 2.31it/s, loss=0.042, lr=0.0001]\nSteps: 18%|█▊ | 125/700 [00:54<04:08, 2.31it/s, loss=0.042, lr=0.0001]\nSteps: 18%|█▊ | 125/700 [00:54<04:08, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 18%|█▊ | 126/700 [00:54<04:08, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 18%|█▊ | 126/700 [00:54<04:08, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 18%|█▊ | 
127/700 [00:55<04:07, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 18%|█▊ | 127/700 [00:55<04:07, 2.31it/s, loss=0.0841, lr=0.0001]\nSteps: 18%|█▊ | 128/700 [00:55<04:07, 2.31it/s, loss=0.0841, lr=0.0001]\nSteps: 18%|█▊ | 128/700 [00:55<04:07, 2.31it/s, loss=0.0609, lr=0.0001]\nSteps: 18%|█▊ | 129/700 [00:56<04:06, 2.32it/s, loss=0.0609, lr=0.0001]\nSteps: 18%|█▊ | 129/700 [00:56<04:06, 2.32it/s, loss=0.154, lr=0.0001] \nSteps: 19%|█▊ | 130/700 [00:56<04:06, 2.31it/s, loss=0.154, lr=0.0001]\nSteps: 19%|█▊ | 130/700 [00:56<04:06, 2.31it/s, loss=0.0725, lr=0.0001]\nSteps: 19%|█▊ | 131/700 [00:56<04:05, 2.31it/s, loss=0.0725, lr=0.0001]\nSteps: 19%|█▊ | 131/700 [00:56<04:05, 2.31it/s, loss=0.112, lr=0.0001] \nSteps: 19%|█▉ | 132/700 [00:57<04:05, 2.32it/s, loss=0.112, lr=0.0001]\nSteps: 19%|█▉ | 132/700 [00:57<04:05, 2.32it/s, loss=0.0866, lr=0.0001]\nSteps: 19%|█▉ | 133/700 [00:57<04:06, 2.30it/s, loss=0.0866, lr=0.0001]\nSteps: 19%|█▉ | 133/700 [00:57<04:06, 2.30it/s, loss=0.0815, lr=0.0001]\nSteps: 19%|█▉ | 134/700 [00:58<04:05, 2.31it/s, loss=0.0815, lr=0.0001]\nSteps: 19%|█▉ | 134/700 [00:58<04:05, 2.31it/s, loss=0.0781, lr=0.0001]\nSteps: 19%|█▉ | 135/700 [00:58<04:04, 2.31it/s, loss=0.0781, lr=0.0001]\nSteps: 19%|█▉ | 135/700 [00:58<04:04, 2.31it/s, loss=0.0736, lr=0.0001]\nSteps: 19%|█▉ | 136/700 [00:59<04:04, 2.31it/s, loss=0.0736, lr=0.0001]\nSteps: 19%|█▉ | 136/700 [00:59<04:04, 2.31it/s, loss=0.0696, lr=0.0001]\nSteps: 20%|█▉ | 137/700 [00:59<04:03, 2.31it/s, loss=0.0696, lr=0.0001]\nSteps: 20%|█▉ | 137/700 [00:59<04:03, 2.31it/s, loss=0.0871, lr=0.0001]\nSteps: 20%|█▉ | 138/700 [00:59<04:02, 2.31it/s, loss=0.0871, lr=0.0001]\nSteps: 20%|█▉ | 138/700 [00:59<04:02, 2.31it/s, loss=0.0361, lr=0.0001]\nSteps: 20%|█▉ | 139/700 [01:00<04:02, 2.31it/s, loss=0.0361, lr=0.0001]\nSteps: 20%|█▉ | 139/700 [01:00<04:02, 2.31it/s, loss=0.0547, lr=0.0001]\nSteps: 20%|██ | 140/700 [01:00<04:02, 2.31it/s, loss=0.0547, lr=0.0001]\nSteps: 20%|██ | 140/700 [01:00<04:02, 2.31it/s, 
loss=0.0273, lr=0.0001]\nSteps: 20%|██ | 141/700 [01:01<04:01, 2.32it/s, loss=0.0273, lr=0.0001]\nSteps: 20%|██ | 141/700 [01:01<04:01, 2.32it/s, loss=0.0602, lr=0.0001]\nSteps: 20%|██ | 142/700 [01:01<04:00, 2.32it/s, loss=0.0602, lr=0.0001]\nSteps: 20%|██ | 142/700 [01:01<04:00, 2.32it/s, loss=0.159, lr=0.0001] \nSteps: 20%|██ | 143/700 [01:02<04:00, 2.32it/s, loss=0.159, lr=0.0001]\nSteps: 20%|██ | 143/700 [01:02<04:00, 2.32it/s, loss=0.0487, lr=0.0001]\nSteps: 21%|██ | 144/700 [01:02<04:00, 2.32it/s, loss=0.0487, lr=0.0001]\nSteps: 21%|██ | 144/700 [01:02<04:00, 2.32it/s, loss=0.0591, lr=0.0001]\nSteps: 21%|██ | 145/700 [01:02<04:00, 2.30it/s, loss=0.0591, lr=0.0001]\nSteps: 21%|██ | 145/700 [01:02<04:00, 2.30it/s, loss=0.0889, lr=0.0001]\nSteps: 21%|██ | 146/700 [01:03<04:00, 2.30it/s, loss=0.0889, lr=0.0001]\nSteps: 21%|██ | 146/700 [01:03<04:00, 2.30it/s, loss=0.109, lr=0.0001] \nSteps: 21%|██ | 147/700 [01:03<03:59, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 21%|██ | 147/700 [01:03<03:59, 2.31it/s, loss=0.0888, lr=0.0001]\nSteps: 21%|██ | 148/700 [01:04<03:58, 2.31it/s, loss=0.0888, lr=0.0001]\nSteps: 21%|██ | 148/700 [01:04<03:58, 2.31it/s, loss=0.163, lr=0.0001] \nSteps: 21%|██▏ | 149/700 [01:04<03:58, 2.31it/s, loss=0.163, lr=0.0001]\nSteps: 21%|██▏ | 149/700 [01:04<03:58, 2.31it/s, loss=0.132, lr=0.0001]\nSteps: 21%|██▏ | 150/700 [01:05<03:57, 2.31it/s, loss=0.132, lr=0.0001]\nSteps: 21%|██▏ | 150/700 [01:05<03:57, 2.31it/s, loss=0.163, lr=0.0001]\nSteps: 22%|██▏ | 151/700 [01:05<03:57, 2.31it/s, loss=0.163, lr=0.0001]\nSteps: 22%|██▏ | 151/700 [01:05<03:57, 2.31it/s, loss=0.0821, lr=0.0001]\nSteps: 22%|██▏ | 152/700 [01:05<03:57, 2.31it/s, loss=0.0821, lr=0.0001]\nSteps: 22%|██▏ | 152/700 [01:05<03:57, 2.31it/s, loss=0.136, lr=0.0001] \nSteps: 22%|██▏ | 153/700 [01:06<03:56, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 22%|██▏ | 153/700 [01:06<03:56, 2.31it/s, loss=0.0459, lr=0.0001]\nSteps: 22%|██▏ | 154/700 [01:06<03:56, 2.31it/s, loss=0.0459, 
lr=0.0001]\nSteps: 22%|██▏ | 154/700 [01:06<03:56, 2.31it/s, loss=0.106, lr=0.0001] \nSteps: 22%|██▏ | 155/700 [01:07<03:55, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 22%|██▏ | 155/700 [01:07<03:55, 2.31it/s, loss=0.0971, lr=0.0001]\nSteps: 22%|██▏ | 156/700 [01:07<03:55, 2.31it/s, loss=0.0971, lr=0.0001]\nSteps: 22%|██▏ | 156/700 [01:07<03:55, 2.31it/s, loss=0.0542, lr=0.0001]\nSteps: 22%|██▏ | 157/700 [01:08<03:56, 2.30it/s, loss=0.0542, lr=0.0001]\nSteps: 22%|██▏ | 157/700 [01:08<03:56, 2.30it/s, loss=0.078, lr=0.0001] \nSteps: 23%|██▎ | 158/700 [01:08<03:55, 2.30it/s, loss=0.078, lr=0.0001]\nSteps: 23%|██▎ | 158/700 [01:08<03:55, 2.30it/s, loss=0.106, lr=0.0001]\nSteps: 23%|██▎ | 159/700 [01:08<03:54, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 23%|██▎ | 159/700 [01:09<03:54, 2.31it/s, loss=0.0751, lr=0.0001]\nSteps: 23%|██▎ | 160/700 [01:09<03:53, 2.31it/s, loss=0.0751, lr=0.0001]\nSteps: 23%|██▎ | 160/700 [01:09<03:53, 2.31it/s, loss=0.178, lr=0.0001] \nSteps: 23%|██▎ | 161/700 [01:09<03:53, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 23%|██▎ | 161/700 [01:09<03:53, 2.31it/s, loss=0.0641, lr=0.0001]\nSteps: 23%|██▎ | 162/700 [01:10<03:52, 2.31it/s, loss=0.0641, lr=0.0001]\nSteps: 23%|██▎ | 162/700 [01:10<03:52, 2.31it/s, loss=0.187, lr=0.0001] \nSteps: 23%|██▎ | 163/700 [01:10<03:52, 2.31it/s, loss=0.187, lr=0.0001]\nSteps: 23%|██▎ | 163/700 [01:10<03:52, 2.31it/s, loss=0.237, lr=0.0001]\nSteps: 23%|██▎ | 164/700 [01:11<03:51, 2.31it/s, loss=0.237, lr=0.0001]\nSteps: 23%|██▎ | 164/700 [01:11<03:51, 2.31it/s, loss=0.0783, lr=0.0001]\nSteps: 24%|██▎ | 165/700 [01:11<03:51, 2.31it/s, loss=0.0783, lr=0.0001]\nSteps: 24%|██▎ | 165/700 [01:11<03:51, 2.31it/s, loss=0.0929, lr=0.0001]\nSteps: 24%|██▎ | 166/700 [01:12<03:50, 2.31it/s, loss=0.0929, lr=0.0001]\nSteps: 24%|██▎ | 166/700 [01:12<03:50, 2.31it/s, loss=0.168, lr=0.0001] \nSteps: 24%|██▍ | 167/700 [01:12<03:50, 2.31it/s, loss=0.168, lr=0.0001]\nSteps: 24%|██▍ | 167/700 [01:12<03:50, 2.31it/s, loss=0.0386, 
lr=0.0001]\nSteps: 24%|██▍ | 168/700 [01:12<03:49, 2.31it/s, loss=0.0386, lr=0.0001]\nSteps: 24%|██▍ | 168/700 [01:12<03:49, 2.31it/s, loss=0.047, lr=0.0001] \nSteps: 24%|██▍ | 169/700 [01:13<03:50, 2.30it/s, loss=0.047, lr=0.0001]\nSteps: 24%|██▍ | 169/700 [01:13<03:50, 2.30it/s, loss=0.0313, lr=0.0001]\nSteps: 24%|██▍ | 170/700 [01:13<03:50, 2.30it/s, loss=0.0313, lr=0.0001]\nSteps: 24%|██▍ | 170/700 [01:13<03:50, 2.30it/s, loss=0.128, lr=0.0001] \nSteps: 24%|██▍ | 171/700 [01:14<03:49, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 24%|██▍ | 171/700 [01:14<03:49, 2.31it/s, loss=0.145, lr=0.0001]\nSteps: 25%|██▍ | 172/700 [01:14<03:48, 2.31it/s, loss=0.145, lr=0.0001]\nSteps: 25%|██▍ | 172/700 [01:14<03:48, 2.31it/s, loss=0.0553, lr=0.0001]\nSteps: 25%|██▍ | 173/700 [01:15<03:48, 2.31it/s, loss=0.0553, lr=0.0001]\nSteps: 25%|██▍ | 173/700 [01:15<03:48, 2.31it/s, loss=0.137, lr=0.0001] \nSteps: 25%|██▍ | 174/700 [01:15<03:47, 2.31it/s, loss=0.137, lr=0.0001]\nSteps: 25%|██▍ | 174/700 [01:15<03:47, 2.31it/s, loss=0.0654, lr=0.0001]\nSteps: 25%|██▌ | 175/700 [01:15<03:47, 2.31it/s, loss=0.0654, lr=0.0001]\nSteps: 25%|██▌ | 175/700 [01:15<03:47, 2.31it/s, loss=0.128, lr=0.0001] \nSteps: 25%|██▌ | 176/700 [01:16<03:46, 2.31it/s, loss=0.128, lr=0.0001]\nSteps: 25%|██▌ | 176/700 [01:16<03:46, 2.31it/s, loss=0.31, lr=0.0001] \nSteps: 25%|██▌ | 177/700 [01:16<03:46, 2.31it/s, loss=0.31, lr=0.0001]\nSteps: 25%|██▌ | 177/700 [01:16<03:46, 2.31it/s, loss=0.0623, lr=0.0001]\nSteps: 25%|██▌ | 178/700 [01:17<03:45, 2.31it/s, loss=0.0623, lr=0.0001]\nSteps: 25%|██▌ | 178/700 [01:17<03:45, 2.31it/s, loss=0.102, lr=0.0001] \nSteps: 26%|██▌ | 179/700 [01:17<03:45, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 26%|██▌ | 179/700 [01:17<03:45, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 26%|██▌ | 180/700 [01:18<03:44, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 26%|██▌ | 180/700 [01:18<03:44, 2.31it/s, loss=0.0696, lr=0.0001]\nSteps: 26%|██▌ | 181/700 [01:18<03:45, 2.30it/s, loss=0.0696, 
lr=0.0001]\nSteps: 26%|██▌ | 181/700 [01:18<03:45, 2.30it/s, loss=0.156, lr=0.0001] \nSteps: 26%|██▌ | 182/700 [01:18<03:44, 2.30it/s, loss=0.156, lr=0.0001]\nSteps: 26%|██▌ | 182/700 [01:18<03:44, 2.30it/s, loss=0.0437, lr=0.0001]\nSteps: 26%|██▌ | 183/700 [01:19<03:44, 2.31it/s, loss=0.0437, lr=0.0001]\nSteps: 26%|██▌ | 183/700 [01:19<03:44, 2.31it/s, loss=0.0516, lr=0.0001]\nSteps: 26%|██▋ | 184/700 [01:19<03:43, 2.31it/s, loss=0.0516, lr=0.0001]\nSteps: 26%|██▋ | 184/700 [01:19<03:43, 2.31it/s, loss=0.198, lr=0.0001] \nSteps: 26%|██▋ | 185/700 [01:20<03:43, 2.31it/s, loss=0.198, lr=0.0001]\nSteps: 26%|██▋ | 185/700 [01:20<03:43, 2.31it/s, loss=0.0919, lr=0.0001]\nSteps: 27%|██▋ | 186/700 [01:20<03:42, 2.31it/s, loss=0.0919, lr=0.0001]\nSteps: 27%|██▋ | 186/700 [01:20<03:42, 2.31it/s, loss=0.0468, lr=0.0001]\nSteps: 27%|██▋ | 187/700 [01:21<03:42, 2.31it/s, loss=0.0468, lr=0.0001]\nSteps: 27%|██▋ | 187/700 [01:21<03:42, 2.31it/s, loss=0.103, lr=0.0001] \nSteps: 27%|██▋ | 188/700 [01:21<03:41, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 27%|██▋ | 188/700 [01:21<03:41, 2.31it/s, loss=0.21, lr=0.0001] \nSteps: 27%|██▋ | 189/700 [01:21<03:41, 2.31it/s, loss=0.21, lr=0.0001]\nSteps: 27%|██▋ | 189/700 [01:22<03:41, 2.31it/s, loss=0.19, lr=0.0001]\nSteps: 27%|██▋ | 190/700 [01:22<03:40, 2.31it/s, loss=0.19, lr=0.0001]\nSteps: 27%|██▋ | 190/700 [01:22<03:40, 2.31it/s, loss=0.0909, lr=0.0001]\nSteps: 27%|██▋ | 191/700 [01:22<03:40, 2.31it/s, loss=0.0909, lr=0.0001]\nSteps: 27%|██▋ | 191/700 [01:22<03:40, 2.31it/s, loss=0.138, lr=0.0001] \nSteps: 27%|██▋ | 192/700 [01:23<03:39, 2.31it/s, loss=0.138, lr=0.0001]\nSteps: 27%|██▋ | 192/700 [01:23<03:39, 2.31it/s, loss=0.0615, lr=0.0001]\nSteps: 28%|██▊ | 193/700 [01:23<03:40, 2.30it/s, loss=0.0615, lr=0.0001]\nSteps: 28%|██▊ | 193/700 [01:23<03:40, 2.30it/s, loss=0.0493, lr=0.0001]\nSteps: 28%|██▊ | 194/700 [01:24<03:39, 2.30it/s, loss=0.0493, lr=0.0001]\nSteps: 28%|██▊ | 194/700 [01:24<03:39, 2.30it/s, loss=0.0843, 
lr=0.0001]\nSteps: 28%|██▊ | 195/700 [01:24<03:38, 2.31it/s, loss=0.0843, lr=0.0001]\nSteps: 28%|██▊ | 195/700 [01:24<03:38, 2.31it/s, loss=0.126, lr=0.0001] \nSteps: 28%|██▊ | 196/700 [01:25<03:38, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 28%|██▊ | 196/700 [01:25<03:38, 2.31it/s, loss=0.288, lr=0.0001]\nSteps: 28%|██▊ | 197/700 [01:25<03:37, 2.31it/s, loss=0.288, lr=0.0001]\nSteps: 28%|██▊ | 197/700 [01:25<03:37, 2.31it/s, loss=0.237, lr=0.0001]\nSteps: 28%|██▊ | 198/700 [01:25<03:37, 2.31it/s, loss=0.237, lr=0.0001]\nSteps: 28%|██▊ | 198/700 [01:25<03:37, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 28%|██▊ | 199/700 [01:26<03:36, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 28%|██▊ | 199/700 [01:26<03:36, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 29%|██▊ | 200/700 [01:26<03:36, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 29%|██▊ | 200/700 [01:26<03:36, 2.31it/s, loss=0.152, lr=0.0001]\nSteps: 29%|██▊ | 201/700 [01:27<03:35, 2.31it/s, loss=0.152, lr=0.0001]\nSteps: 29%|██▊ | 201/700 [01:27<03:35, 2.31it/s, loss=0.182, lr=0.0001]\nSteps: 29%|██▉ | 202/700 [01:27<03:35, 2.31it/s, loss=0.182, lr=0.0001]\nSteps: 29%|██▉ | 202/700 [01:27<03:35, 2.31it/s, loss=0.0467, lr=0.0001]\nSteps: 29%|██▉ | 203/700 [01:28<03:35, 2.31it/s, loss=0.0467, lr=0.0001]\nSteps: 29%|██▉ | 203/700 [01:28<03:35, 2.31it/s, loss=0.126, lr=0.0001] \nSteps: 29%|██▉ | 204/700 [01:28<03:34, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 29%|██▉ | 204/700 [01:28<03:34, 2.31it/s, loss=0.0631, lr=0.0001]\nSteps: 29%|██▉ | 205/700 [01:28<03:35, 2.30it/s, loss=0.0631, lr=0.0001]\nSteps: 29%|██▉ | 205/700 [01:28<03:35, 2.30it/s, loss=0.0418, lr=0.0001]\nSteps: 29%|██▉ | 206/700 [01:29<03:34, 2.30it/s, loss=0.0418, lr=0.0001]\nSteps: 29%|██▉ | 206/700 [01:29<03:34, 2.30it/s, loss=0.133, lr=0.0001] \nSteps: 30%|██▉ | 207/700 [01:29<03:34, 2.30it/s, loss=0.133, lr=0.0001]\nSteps: 30%|██▉ | 207/700 [01:29<03:34, 2.30it/s, loss=0.0892, lr=0.0001]\nSteps: 30%|██▉ | 208/700 [01:30<03:33, 2.31it/s, loss=0.0892, 
lr=0.0001]\nSteps: 30%|██▉ | 208/700 [01:30<03:33, 2.31it/s, loss=0.103, lr=0.0001] \nSteps: 30%|██▉ | 209/700 [01:30<03:32, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 30%|██▉ | 209/700 [01:30<03:32, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 30%|███ | 210/700 [01:31<03:32, 2.31it/s, loss=0.178, lr=0.0001]\nSteps: 30%|███ | 210/700 [01:31<03:32, 2.31it/s, loss=0.0359, lr=0.0001]\nSteps: 30%|███ | 211/700 [01:31<03:31, 2.31it/s, loss=0.0359, lr=0.0001]\nSteps: 30%|███ | 211/700 [01:31<03:31, 2.31it/s, loss=0.0537, lr=0.0001]\nSteps: 30%|███ | 212/700 [01:31<03:31, 2.31it/s, loss=0.0537, lr=0.0001]\nSteps: 30%|███ | 212/700 [01:31<03:31, 2.31it/s, loss=0.0484, lr=0.0001]\nSteps: 30%|███ | 213/700 [01:32<03:31, 2.31it/s, loss=0.0484, lr=0.0001]\nSteps: 30%|███ | 213/700 [01:32<03:31, 2.31it/s, loss=0.02, lr=0.0001] \nSteps: 31%|███ | 214/700 [01:32<03:30, 2.31it/s, loss=0.02, lr=0.0001]\nSteps: 31%|███ | 214/700 [01:32<03:30, 2.31it/s, loss=0.0563, lr=0.0001]\nSteps: 31%|███ | 215/700 [01:33<03:29, 2.31it/s, loss=0.0563, lr=0.0001]\nSteps: 31%|███ | 215/700 [01:33<03:29, 2.31it/s, loss=0.0508, lr=0.0001]\nSteps: 31%|███ | 216/700 [01:33<03:29, 2.31it/s, loss=0.0508, lr=0.0001]\nSteps: 31%|███ | 216/700 [01:33<03:29, 2.31it/s, loss=0.0738, lr=0.0001]\nSteps: 31%|███ | 217/700 [01:34<03:30, 2.30it/s, loss=0.0738, lr=0.0001]\nSteps: 31%|███ | 217/700 [01:34<03:30, 2.30it/s, loss=0.0832, lr=0.0001]\nSteps: 31%|███ | 218/700 [01:34<03:29, 2.30it/s, loss=0.0832, lr=0.0001]\nSteps: 31%|███ | 218/700 [01:34<03:29, 2.30it/s, loss=0.151, lr=0.0001] \nSteps: 31%|███▏ | 219/700 [01:34<03:28, 2.31it/s, loss=0.151, lr=0.0001]\nSteps: 31%|███▏ | 219/700 [01:35<03:28, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 31%|███▏ | 220/700 [01:35<03:27, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 31%|███▏ | 220/700 [01:35<03:27, 2.31it/s, loss=0.074, lr=0.0001]\nSteps: 32%|███▏ | 221/700 [01:35<03:27, 2.31it/s, loss=0.074, lr=0.0001]\nSteps: 32%|███▏ | 221/700 [01:35<03:27, 2.31it/s, loss=0.15, 
lr=0.0001] \nSteps: 32%|███▏ | 222/700 [01:36<03:26, 2.31it/s, loss=0.15, lr=0.0001]\nSteps: 32%|███▏ | 222/700 [01:36<03:26, 2.31it/s, loss=0.0893, lr=0.0001]\nSteps: 32%|███▏ | 223/700 [01:36<03:26, 2.31it/s, loss=0.0893, lr=0.0001]\nSteps: 32%|███▏ | 223/700 [01:36<03:26, 2.31it/s, loss=0.118, lr=0.0001] \nSteps: 32%|███▏ | 224/700 [01:37<03:25, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 32%|███▏ | 224/700 [01:37<03:25, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 32%|███▏ | 225/700 [01:37<03:25, 2.31it/s, loss=0.156, lr=0.0001]\nSteps: 32%|███▏ | 225/700 [01:37<03:25, 2.31it/s, loss=0.0856, lr=0.0001]\nSteps: 32%|███▏ | 226/700 [01:38<03:25, 2.31it/s, loss=0.0856, lr=0.0001]\nSteps: 32%|███▏ | 226/700 [01:38<03:25, 2.31it/s, loss=0.142, lr=0.0001] \nSteps: 32%|███▏ | 227/700 [01:38<03:24, 2.31it/s, loss=0.142, lr=0.0001]\nSteps: 32%|███▏ | 227/700 [01:38<03:24, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 33%|███▎ | 228/700 [01:38<03:24, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 33%|███▎ | 228/700 [01:38<03:24, 2.31it/s, loss=0.0868, lr=0.0001]\nSteps: 33%|███▎ | 229/700 [01:39<03:24, 2.30it/s, loss=0.0868, lr=0.0001]\nSteps: 33%|███▎ | 229/700 [01:39<03:24, 2.30it/s, loss=0.0699, lr=0.0001]\nSteps: 33%|███▎ | 230/700 [01:39<03:23, 2.31it/s, loss=0.0699, lr=0.0001]\nSteps: 33%|███▎ | 230/700 [01:39<03:23, 2.31it/s, loss=0.111, lr=0.0001] \nSteps: 33%|███▎ | 231/700 [01:40<03:23, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 33%|███▎ | 231/700 [01:40<03:23, 2.31it/s, loss=0.0788, lr=0.0001]\nSteps: 33%|███▎ | 232/700 [01:40<03:22, 2.31it/s, loss=0.0788, lr=0.0001]\nSteps: 33%|███▎ | 232/700 [01:40<03:22, 2.31it/s, loss=0.0501, lr=0.0001]\nSteps: 33%|███▎ | 233/700 [01:41<03:22, 2.31it/s, loss=0.0501, lr=0.0001]\nSteps: 33%|███▎ | 233/700 [01:41<03:22, 2.31it/s, loss=0.0609, lr=0.0001]\nSteps: 33%|███▎ | 234/700 [01:41<03:21, 2.31it/s, loss=0.0609, lr=0.0001]\nSteps: 33%|███▎ | 234/700 [01:41<03:21, 2.31it/s, loss=0.0557, lr=0.0001]\nSteps: 34%|███▎ | 235/700 [01:41<03:21, 
2.31it/s, loss=0.0557, lr=0.0001]\nSteps: 34%|███▎ | 235/700 [01:41<03:21, 2.31it/s, loss=0.0626, lr=0.0001]\nSteps: 34%|███▎ | 236/700 [01:42<03:21, 2.31it/s, loss=0.0626, lr=0.0001]\nSteps: 34%|███▎ | 236/700 [01:42<03:21, 2.31it/s, loss=0.23, lr=0.0001] \nSteps: 34%|███▍ | 237/700 [01:42<03:20, 2.30it/s, loss=0.23, lr=0.0001]\nSteps: 34%|███▍ | 237/700 [01:42<03:20, 2.30it/s, loss=0.186, lr=0.0001]\nSteps: 34%|███▍ | 238/700 [01:43<03:20, 2.31it/s, loss=0.186, lr=0.0001]\nSteps: 34%|███▍ | 238/700 [01:43<03:20, 2.31it/s, loss=0.067, lr=0.0001]\nSteps: 34%|███▍ | 239/700 [01:43<03:19, 2.31it/s, loss=0.067, lr=0.0001]\nSteps: 34%|███▍ | 239/700 [01:43<03:19, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 34%|███▍ | 240/700 [01:44<03:19, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 34%|███▍ | 240/700 [01:44<03:19, 2.31it/s, loss=0.0939, lr=0.0001]\nSteps: 34%|███▍ | 241/700 [01:44<03:19, 2.30it/s, loss=0.0939, lr=0.0001]\nSteps: 34%|███▍ | 241/700 [01:44<03:19, 2.30it/s, loss=0.0754, lr=0.0001]\nSteps: 35%|███▍ | 242/700 [01:44<03:19, 2.30it/s, loss=0.0754, lr=0.0001]\nSteps: 35%|███▍ | 242/700 [01:44<03:19, 2.30it/s, loss=0.214, lr=0.0001] \nSteps: 35%|███▍ | 243/700 [01:45<03:18, 2.30it/s, loss=0.214, lr=0.0001]\nSteps: 35%|███▍ | 243/700 [01:45<03:18, 2.30it/s, loss=0.096, lr=0.0001]\nSteps: 35%|███▍ | 244/700 [01:45<03:17, 2.31it/s, loss=0.096, lr=0.0001]\nSteps: 35%|███▍ | 244/700 [01:45<03:17, 2.31it/s, loss=0.0839, lr=0.0001]\nSteps: 35%|███▌ | 245/700 [01:46<03:17, 2.31it/s, loss=0.0839, lr=0.0001]\nSteps: 35%|███▌ | 245/700 [01:46<03:17, 2.31it/s, loss=0.133, lr=0.0001] \nSteps: 35%|███▌ | 246/700 [01:46<03:16, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 35%|███▌ | 246/700 [01:46<03:16, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 35%|███▌ | 247/700 [01:47<03:16, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 35%|███▌ | 247/700 [01:47<03:16, 2.31it/s, loss=0.0977, lr=0.0001]\nSteps: 35%|███▌ | 248/700 [01:47<03:15, 2.31it/s, loss=0.0977, lr=0.0001]\nSteps: 35%|███▌ | 248/700 
[01:47<03:15, 2.31it/s, loss=0.164, lr=0.0001] \nSteps: 36%|███▌ | 249/700 [01:47<03:15, 2.31it/s, loss=0.164, lr=0.0001]\nSteps: 36%|███▌ | 249/700 [01:48<03:15, 2.31it/s, loss=0.059, lr=0.0001]\nSteps: 36%|███▌ | 250/700 [01:48<03:14, 2.31it/s, loss=0.059, lr=0.0001]\nSteps: 36%|███▌ | 250/700 [01:48<03:14, 2.31it/s, loss=0.052, lr=0.0001]\nSteps: 36%|███▌ | 251/700 [01:48<03:14, 2.31it/s, loss=0.052, lr=0.0001]\nSteps: 36%|███▌ | 251/700 [01:48<03:14, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 36%|███▌ | 252/700 [01:49<03:14, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 36%|███▌ | 252/700 [01:49<03:14, 2.31it/s, loss=0.0825, lr=0.0001]\nSteps: 36%|███▌ | 253/700 [01:49<03:14, 2.30it/s, loss=0.0825, lr=0.0001]\nSteps: 36%|███▌ | 253/700 [01:49<03:14, 2.30it/s, loss=0.047, lr=0.0001] \nSteps: 36%|███▋ | 254/700 [01:50<03:13, 2.30it/s, loss=0.047, lr=0.0001]\nSteps: 36%|███▋ | 254/700 [01:50<03:13, 2.30it/s, loss=0.0716, lr=0.0001]\nSteps: 36%|███▋ | 255/700 [01:50<03:13, 2.30it/s, loss=0.0716, lr=0.0001]\nSteps: 36%|███▋ | 255/700 [01:50<03:13, 2.30it/s, loss=0.0739, lr=0.0001]\nSteps: 37%|███▋ | 256/700 [01:51<03:12, 2.31it/s, loss=0.0739, lr=0.0001]\nSteps: 37%|███▋ | 256/700 [01:51<03:12, 2.31it/s, loss=0.162, lr=0.0001] \nSteps: 37%|███▋ | 257/700 [01:51<03:11, 2.31it/s, loss=0.162, lr=0.0001]\nSteps: 37%|███▋ | 257/700 [01:51<03:11, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 37%|███▋ | 258/700 [01:51<03:11, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 37%|███▋ | 258/700 [01:51<03:11, 2.31it/s, loss=0.0502, lr=0.0001]\nSteps: 37%|███▋ | 259/700 [01:52<03:10, 2.31it/s, loss=0.0502, lr=0.0001]\nSteps: 37%|███▋ | 259/700 [01:52<03:10, 2.31it/s, loss=0.0932, lr=0.0001]\nSteps: 37%|███▋ | 260/700 [01:52<03:10, 2.31it/s, loss=0.0932, lr=0.0001]\nSteps: 37%|███▋ | 260/700 [01:52<03:10, 2.31it/s, loss=0.0996, lr=0.0001]\nSteps: 37%|███▋ | 261/700 [01:53<03:09, 2.31it/s, loss=0.0996, lr=0.0001]\nSteps: 37%|███▋ | 261/700 [01:53<03:09, 2.31it/s, loss=0.0506, lr=0.0001]\nSteps: 
37%|███▋ | 262/700 [01:53<03:09, 2.31it/s, loss=0.0506, lr=0.0001]\nSteps: 37%|███▋ | 262/700 [01:53<03:09, 2.31it/s, loss=0.184, lr=0.0001] \nSteps: 38%|███▊ | 263/700 [01:54<03:09, 2.31it/s, loss=0.184, lr=0.0001]\nSteps: 38%|███▊ | 263/700 [01:54<03:09, 2.31it/s, loss=0.0982, lr=0.0001]\nSteps: 38%|███▊ | 264/700 [01:54<03:08, 2.31it/s, loss=0.0982, lr=0.0001]\nSteps: 38%|███▊ | 264/700 [01:54<03:08, 2.31it/s, loss=0.0981, lr=0.0001]\nSteps: 38%|███▊ | 265/700 [01:54<03:08, 2.30it/s, loss=0.0981, lr=0.0001]\nSteps: 38%|███▊ | 265/700 [01:54<03:08, 2.30it/s, loss=0.0722, lr=0.0001]\nSteps: 38%|███▊ | 266/700 [01:55<03:08, 2.30it/s, loss=0.0722, lr=0.0001]\nSteps: 38%|███▊ | 266/700 [01:55<03:08, 2.30it/s, loss=0.085, lr=0.0001] \nSteps: 38%|███▊ | 267/700 [01:55<03:07, 2.31it/s, loss=0.085, lr=0.0001]\nSteps: 38%|███▊ | 267/700 [01:55<03:07, 2.31it/s, loss=0.0857, lr=0.0001]\nSteps: 38%|███▊ | 268/700 [01:56<03:07, 2.31it/s, loss=0.0857, lr=0.0001]\nSteps: 38%|███▊ | 268/700 [01:56<03:07, 2.31it/s, loss=0.0924, lr=0.0001]\nSteps: 38%|███▊ | 269/700 [01:56<03:06, 2.31it/s, loss=0.0924, lr=0.0001]\nSteps: 38%|███▊ | 269/700 [01:56<03:06, 2.31it/s, loss=0.0701, lr=0.0001]\nSteps: 39%|███▊ | 270/700 [01:57<03:06, 2.31it/s, loss=0.0701, lr=0.0001]\nSteps: 39%|███▊ | 270/700 [01:57<03:06, 2.31it/s, loss=0.0999, lr=0.0001]\nSteps: 39%|███▊ | 271/700 [01:57<03:05, 2.31it/s, loss=0.0999, lr=0.0001]\nSteps: 39%|███▊ | 271/700 [01:57<03:05, 2.31it/s, loss=0.106, lr=0.0001] \nSteps: 39%|███▉ | 272/700 [01:57<03:05, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 39%|███▉ | 272/700 [01:57<03:05, 2.31it/s, loss=0.0785, lr=0.0001]\nSteps: 39%|███▉ | 273/700 [01:58<03:04, 2.31it/s, loss=0.0785, lr=0.0001]\nSteps: 39%|███▉ | 273/700 [01:58<03:04, 2.31it/s, loss=0.121, lr=0.0001] \nSteps: 39%|███▉ | 274/700 [01:58<03:04, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 39%|███▉ | 274/700 [01:58<03:04, 2.31it/s, loss=0.0753, lr=0.0001]\nSteps: 39%|███▉ | 275/700 [01:59<03:04, 2.31it/s, 
loss=0.0753, lr=0.0001]\nSteps: 39%|███▉ | 275/700 [01:59<03:04, 2.31it/s, loss=0.0554, lr=0.0001]\nSteps: 39%|███▉ | 276/700 [01:59<03:03, 2.31it/s, loss=0.0554, lr=0.0001]\nSteps: 39%|███▉ | 276/700 [01:59<03:03, 2.31it/s, loss=0.153, lr=0.0001] \nSteps: 40%|███▉ | 277/700 [02:00<03:04, 2.30it/s, loss=0.153, lr=0.0001]\nSteps: 40%|███▉ | 277/700 [02:00<03:04, 2.30it/s, loss=0.117, lr=0.0001]\nSteps: 40%|███▉ | 278/700 [02:00<03:03, 2.30it/s, loss=0.117, lr=0.0001]\nSteps: 40%|███▉ | 278/700 [02:00<03:03, 2.30it/s, loss=0.174, lr=0.0001]\nSteps: 40%|███▉ | 279/700 [02:00<03:02, 2.30it/s, loss=0.174, lr=0.0001]\nSteps: 40%|███▉ | 279/700 [02:01<03:02, 2.30it/s, loss=0.165, lr=0.0001]\nSteps: 40%|████ | 280/700 [02:01<03:02, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 40%|████ | 280/700 [02:01<03:02, 2.31it/s, loss=0.0458, lr=0.0001]\nSteps: 40%|████ | 281/700 [02:01<03:01, 2.31it/s, loss=0.0458, lr=0.0001]\nSteps: 40%|████ | 281/700 [02:01<03:01, 2.31it/s, loss=0.123, lr=0.0001] \nSteps: 40%|████ | 282/700 [02:02<03:01, 2.31it/s, loss=0.123, lr=0.0001]\nSteps: 40%|████ | 282/700 [02:02<03:01, 2.31it/s, loss=0.0655, lr=0.0001]\nSteps: 40%|████ | 283/700 [02:02<03:00, 2.31it/s, loss=0.0655, lr=0.0001]\nSteps: 40%|████ | 283/700 [02:02<03:00, 2.31it/s, loss=0.173, lr=0.0001] \nSteps: 41%|████ | 284/700 [02:03<03:00, 2.31it/s, loss=0.173, lr=0.0001]\nSteps: 41%|████ | 284/700 [02:03<03:00, 2.31it/s, loss=0.0757, lr=0.0001]\nSteps: 41%|████ | 285/700 [02:03<02:59, 2.31it/s, loss=0.0757, lr=0.0001]\nSteps: 41%|████ | 285/700 [02:03<02:59, 2.31it/s, loss=0.0679, lr=0.0001]\nSteps: 41%|████ | 286/700 [02:04<02:59, 2.31it/s, loss=0.0679, lr=0.0001]\nSteps: 41%|████ | 286/700 [02:04<02:59, 2.31it/s, loss=0.0842, lr=0.0001]\nSteps: 41%|████ | 287/700 [02:04<02:58, 2.31it/s, loss=0.0842, lr=0.0001]\nSteps: 41%|████ | 287/700 [02:04<02:58, 2.31it/s, loss=0.0515, lr=0.0001]\nSteps: 41%|████ | 288/700 [02:04<02:58, 2.31it/s, loss=0.0515, lr=0.0001]\nSteps: 41%|████ | 288/700 
[02:04<02:58, 2.31it/s, loss=0.046, lr=0.0001] \nSteps: 41%|████▏ | 289/700 [02:05<02:58, 2.30it/s, loss=0.046, lr=0.0001]\nSteps: 41%|████▏ | 289/700 [02:05<02:58, 2.30it/s, loss=0.0335, lr=0.0001]\nSteps: 41%|████▏ | 290/700 [02:05<02:57, 2.30it/s, loss=0.0335, lr=0.0001]\nSteps: 41%|████▏ | 290/700 [02:05<02:57, 2.30it/s, loss=0.249, lr=0.0001] \nSteps: 42%|████▏ | 291/700 [02:06<02:57, 2.31it/s, loss=0.249, lr=0.0001]\nSteps: 42%|████▏ | 291/700 [02:06<02:57, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 42%|████▏ | 292/700 [02:06<02:56, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 42%|████▏ | 292/700 [02:06<02:56, 2.31it/s, loss=0.11, lr=0.0001] \nSteps: 42%|████▏ | 293/700 [02:07<02:56, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 42%|████▏ | 293/700 [02:07<02:56, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 42%|████▏ | 294/700 [02:07<02:55, 2.31it/s, loss=0.166, lr=0.0001]\nSteps: 42%|████▏ | 294/700 [02:07<02:55, 2.31it/s, loss=0.196, lr=0.0001]\nSteps: 42%|████▏ | 295/700 [02:07<02:55, 2.31it/s, loss=0.196, lr=0.0001]\nSteps: 42%|████▏ | 295/700 [02:07<02:55, 2.31it/s, loss=0.16, lr=0.0001] \nSteps: 42%|████▏ | 296/700 [02:08<02:54, 2.31it/s, loss=0.16, lr=0.0001]\nSteps: 42%|████▏ | 296/700 [02:08<02:54, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 42%|████▏ | 297/700 [02:08<02:54, 2.31it/s, loss=0.125, lr=0.0001]\nSteps: 42%|████▏ | 297/700 [02:08<02:54, 2.31it/s, loss=0.0685, lr=0.0001]\nSteps: 43%|████▎ | 298/700 [02:09<02:54, 2.31it/s, loss=0.0685, lr=0.0001]\nSteps: 43%|████▎ | 298/700 [02:09<02:54, 2.31it/s, loss=0.0654, lr=0.0001]\nSteps: 43%|████▎ | 299/700 [02:09<02:53, 2.31it/s, loss=0.0654, lr=0.0001]\nSteps: 43%|████▎ | 299/700 [02:09<02:53, 2.31it/s, loss=0.102, lr=0.0001] \nSteps: 43%|████▎ | 300/700 [02:10<02:53, 2.31it/s, loss=0.102, lr=0.0001]\nSteps: 43%|████▎ | 300/700 [02:10<02:53, 2.31it/s, loss=0.307, lr=0.0001]\nSteps: 43%|████▎ | 301/700 [02:10<02:53, 2.30it/s, loss=0.307, lr=0.0001]\nSteps: 43%|████▎ | 301/700 [02:10<02:53, 2.30it/s, loss=0.0656, 
lr=0.0001]\nSteps: 43%|████▎ | 302/700 [02:10<02:52, 2.30it/s, loss=0.0656, lr=0.0001]\nSteps: 43%|████▎ | 302/700 [02:10<02:52, 2.30it/s, loss=0.13, lr=0.0001] \nSteps: 43%|████▎ | 303/700 [02:11<02:52, 2.30it/s, loss=0.13, lr=0.0001]\nSteps: 43%|████▎ | 303/700 [02:11<02:52, 2.30it/s, loss=0.147, lr=0.0001]\nSteps: 43%|████▎ | 304/700 [02:11<02:51, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 43%|████▎ | 304/700 [02:11<02:51, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 44%|████▎ | 305/700 [02:12<02:51, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 44%|████▎ | 305/700 [02:12<02:51, 2.31it/s, loss=0.0742, lr=0.0001]\nSteps: 44%|████▎ | 306/700 [02:12<02:50, 2.31it/s, loss=0.0742, lr=0.0001]\nSteps: 44%|████▎ | 306/700 [02:12<02:50, 2.31it/s, loss=0.208, lr=0.0001] \nSteps: 44%|████▍ | 307/700 [02:13<02:50, 2.31it/s, loss=0.208, lr=0.0001]\nSteps: 44%|████▍ | 307/700 [02:13<02:50, 2.31it/s, loss=0.138, lr=0.0001]\nSteps: 44%|████▍ | 308/700 [02:13<02:49, 2.31it/s, loss=0.138, lr=0.0001]\nSteps: 44%|████▍ | 308/700 [02:13<02:49, 2.31it/s, loss=0.0506, lr=0.0001]\nSteps: 44%|████▍ | 309/700 [02:13<02:49, 2.31it/s, loss=0.0506, lr=0.0001]\nSteps: 44%|████▍ | 309/700 [02:14<02:49, 2.31it/s, loss=0.0898, lr=0.0001]\nSteps: 44%|████▍ | 310/700 [02:14<02:48, 2.31it/s, loss=0.0898, lr=0.0001]\nSteps: 44%|████▍ | 310/700 [02:14<02:48, 2.31it/s, loss=0.157, lr=0.0001] \nSteps: 44%|████▍ | 311/700 [02:14<02:48, 2.31it/s, loss=0.157, lr=0.0001]\nSteps: 44%|████▍ | 311/700 [02:14<02:48, 2.31it/s, loss=0.13, lr=0.0001] \nSteps: 45%|████▍ | 312/700 [02:15<02:47, 2.31it/s, loss=0.13, lr=0.0001]\nSteps: 45%|████▍ | 312/700 [02:15<02:47, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 45%|████▍ | 313/700 [02:15<02:48, 2.30it/s, loss=0.104, lr=0.0001]\nSteps: 45%|████▍ | 313/700 [02:15<02:48, 2.30it/s, loss=0.0702, lr=0.0001]\nSteps: 45%|████▍ | 314/700 [02:16<02:47, 2.30it/s, loss=0.0702, lr=0.0001]\nSteps: 45%|████▍ | 314/700 [02:16<02:47, 2.30it/s, loss=0.0639, lr=0.0001]\nSteps: 45%|████▌ | 315/700 
[02:16<02:47, 2.31it/s, loss=0.0639, lr=0.0001]\nSteps: 45%|████▌ | 315/700 [02:16<02:47, 2.31it/s, loss=0.0803, lr=0.0001]\nSteps: 45%|████▌ | 316/700 [02:17<02:46, 2.31it/s, loss=0.0803, lr=0.0001]\nSteps: 45%|████▌ | 316/700 [02:17<02:46, 2.31it/s, loss=0.0989, lr=0.0001]\nSteps: 45%|████▌ | 317/700 [02:17<02:45, 2.31it/s, loss=0.0989, lr=0.0001]\nSteps: 45%|████▌ | 317/700 [02:17<02:45, 2.31it/s, loss=0.0508, lr=0.0001]\nSteps: 45%|████▌ | 318/700 [02:17<02:45, 2.31it/s, loss=0.0508, lr=0.0001]\nSteps: 45%|████▌ | 318/700 [02:17<02:45, 2.31it/s, loss=0.0966, lr=0.0001]\nSteps: 46%|████▌ | 319/700 [02:18<02:44, 2.31it/s, loss=0.0966, lr=0.0001]\nSteps: 46%|████▌ | 319/700 [02:18<02:44, 2.31it/s, loss=0.186, lr=0.0001] \nSteps: 46%|████▌ | 320/700 [02:18<02:44, 2.31it/s, loss=0.186, lr=0.0001]\nSteps: 46%|████▌ | 320/700 [02:18<02:44, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 46%|████▌ | 321/700 [02:19<02:44, 2.31it/s, loss=0.113, lr=0.0001]\nSteps: 46%|████▌ | 321/700 [02:19<02:44, 2.31it/s, loss=0.075, lr=0.0001]\nSteps: 46%|████▌ | 322/700 [02:19<02:43, 2.31it/s, loss=0.075, lr=0.0001]\nSteps: 46%|████▌ | 322/700 [02:19<02:43, 2.31it/s, loss=0.1, lr=0.0001] \nSteps: 46%|████▌ | 323/700 [02:20<02:43, 2.31it/s, loss=0.1, lr=0.0001]\nSteps: 46%|████▌ | 323/700 [02:20<02:43, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 46%|████▋ | 324/700 [02:20<02:42, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 46%|████▋ | 324/700 [02:20<02:42, 2.31it/s, loss=0.0512, lr=0.0001]\nSteps: 46%|████▋ | 325/700 [02:20<02:43, 2.30it/s, loss=0.0512, lr=0.0001]\nSteps: 46%|████▋ | 325/700 [02:20<02:43, 2.30it/s, loss=0.139, lr=0.0001] \nSteps: 47%|████▋ | 326/700 [02:21<02:42, 2.30it/s, loss=0.139, lr=0.0001]\nSteps: 47%|████▋ | 326/700 [02:21<02:42, 2.30it/s, loss=0.123, lr=0.0001]\nSteps: 47%|████▋ | 327/700 [02:21<02:41, 2.30it/s, loss=0.123, lr=0.0001]\nSteps: 47%|████▋ | 327/700 [02:21<02:41, 2.30it/s, loss=0.124, lr=0.0001]\nSteps: 47%|████▋ | 328/700 [02:22<02:41, 2.31it/s, loss=0.124, 
lr=0.0001]\nSteps: 47%|████▋ | 328/700 [02:22<02:41, 2.31it/s, loss=0.0366, lr=0.0001]\nSteps: 47%|████▋ | 329/700 [02:22<02:41, 2.30it/s, loss=0.0366, lr=0.0001]\nSteps: 47%|████▋ | 329/700 [02:22<02:41, 2.30it/s, loss=0.0412, lr=0.0001]\nSteps: 47%|████▋ | 330/700 [02:23<02:40, 2.30it/s, loss=0.0412, lr=0.0001]\nSteps: 47%|████▋ | 330/700 [02:23<02:40, 2.30it/s, loss=0.0898, lr=0.0001]\nSteps: 47%|████▋ | 331/700 [02:23<02:40, 2.31it/s, loss=0.0898, lr=0.0001]\nSteps: 47%|████▋ | 331/700 [02:23<02:40, 2.31it/s, loss=0.127, lr=0.0001] \nSteps: 47%|████▋ | 332/700 [02:23<02:39, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 47%|████▋ | 332/700 [02:23<02:39, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 48%|████▊ | 333/700 [02:24<02:39, 2.30it/s, loss=0.103, lr=0.0001]\nSteps: 48%|████▊ | 333/700 [02:24<02:39, 2.30it/s, loss=0.134, lr=0.0001]\nSteps: 48%|████▊ | 334/700 [02:24<02:38, 2.30it/s, loss=0.134, lr=0.0001]\nSteps: 48%|████▊ | 334/700 [02:24<02:38, 2.30it/s, loss=0.142, lr=0.0001]\nSteps: 48%|████▊ | 335/700 [02:25<02:38, 2.30it/s, loss=0.142, lr=0.0001]\nSteps: 48%|████▊ | 335/700 [02:25<02:38, 2.30it/s, loss=0.0705, lr=0.0001]\nSteps: 48%|████▊ | 336/700 [02:25<02:37, 2.31it/s, loss=0.0705, lr=0.0001]\nSteps: 48%|████▊ | 336/700 [02:25<02:37, 2.31it/s, loss=0.0656, lr=0.0001]\nSteps: 48%|████▊ | 337/700 [02:26<02:38, 2.30it/s, loss=0.0656, lr=0.0001]\nSteps: 48%|████▊ | 337/700 [02:26<02:38, 2.30it/s, loss=0.259, lr=0.0001] \nSteps: 48%|████▊ | 338/700 [02:26<02:37, 2.30it/s, loss=0.259, lr=0.0001]\nSteps: 48%|████▊ | 338/700 [02:26<02:37, 2.30it/s, loss=0.058, lr=0.0001]\nSteps: 48%|████▊ | 339/700 [02:27<02:36, 2.30it/s, loss=0.058, lr=0.0001]\nSteps: 48%|████▊ | 339/700 [02:27<02:36, 2.30it/s, loss=0.0758, lr=0.0001]\nSteps: 49%|████▊ | 340/700 [02:27<02:36, 2.30it/s, loss=0.0758, lr=0.0001]\nSteps: 49%|████▊ | 340/700 [02:27<02:36, 2.30it/s, loss=0.151, lr=0.0001] \nSteps: 49%|████▊ | 341/700 [02:27<02:35, 2.30it/s, loss=0.151, lr=0.0001]\nSteps: 49%|████▊ | 
341/700 [02:27<02:35, 2.30it/s, loss=0.0809, lr=0.0001]\nSteps: 49%|████▉ | 342/700 [02:28<02:35, 2.30it/s, loss=0.0809, lr=0.0001]\nSteps: 49%|████▉ | 342/700 [02:28<02:35, 2.30it/s, loss=0.0832, lr=0.0001]\nSteps: 49%|████▉ | 343/700 [02:28<02:34, 2.31it/s, loss=0.0832, lr=0.0001]\nSteps: 49%|████▉ | 343/700 [02:28<02:34, 2.31it/s, loss=0.0567, lr=0.0001]\nSteps: 49%|████▉ | 344/700 [02:29<02:34, 2.30it/s, loss=0.0567, lr=0.0001]\nSteps: 49%|████▉ | 344/700 [02:29<02:34, 2.30it/s, loss=0.192, lr=0.0001] \nSteps: 49%|████▉ | 345/700 [02:29<02:34, 2.30it/s, loss=0.192, lr=0.0001]\nSteps: 49%|████▉ | 345/700 [02:29<02:34, 2.30it/s, loss=0.147, lr=0.0001]\nSteps: 49%|████▉ | 346/700 [02:30<02:33, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 49%|████▉ | 346/700 [02:30<02:33, 2.31it/s, loss=0.0729, lr=0.0001]\nSteps: 50%|████▉ | 347/700 [02:30<02:33, 2.31it/s, loss=0.0729, lr=0.0001]\nSteps: 50%|████▉ | 347/700 [02:30<02:33, 2.31it/s, loss=0.0998, lr=0.0001]\nSteps: 50%|████▉ | 348/700 [02:30<02:32, 2.31it/s, loss=0.0998, lr=0.0001]\nSteps: 50%|████▉ | 348/700 [02:30<02:32, 2.31it/s, loss=0.132, lr=0.0001] \nSteps: 50%|████▉ | 349/700 [02:31<02:32, 2.30it/s, loss=0.132, lr=0.0001]\nSteps: 50%|████▉ | 349/700 [02:31<02:32, 2.30it/s, loss=0.0276, lr=0.0001]\nSteps: 50%|█████ | 350/700 [02:31<02:32, 2.30it/s, loss=0.0276, lr=0.0001]\nSteps: 50%|█████ | 350/700 [02:31<02:32, 2.30it/s, loss=0.198, lr=0.0001] \nSteps: 50%|█████ | 351/700 [02:32<02:31, 2.30it/s, loss=0.198, lr=0.0001]\nSteps: 50%|█████ | 351/700 [02:32<02:31, 2.30it/s, loss=0.135, lr=0.0001]\nSteps: 50%|█████ | 352/700 [02:32<02:30, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 50%|█████ | 352/700 [02:32<02:30, 2.31it/s, loss=0.0165, lr=0.0001]\nSteps: 50%|█████ | 353/700 [02:33<02:30, 2.31it/s, loss=0.0165, lr=0.0001]\nSteps: 50%|█████ | 353/700 [02:33<02:30, 2.31it/s, loss=0.0565, lr=0.0001]\nSteps: 51%|█████ | 354/700 [02:33<02:29, 2.31it/s, loss=0.0565, lr=0.0001]\nSteps: 51%|█████ | 354/700 [02:33<02:29, 
2.31it/s, loss=0.12, lr=0.0001] \nSteps: 51%|█████ | 355/700 [02:33<02:29, 2.31it/s, loss=0.12, lr=0.0001]\nSteps: 51%|█████ | 355/700 [02:33<02:29, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 51%|█████ | 356/700 [02:34<02:29, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 51%|█████ | 356/700 [02:34<02:29, 2.31it/s, loss=0.0892, lr=0.0001]\nSteps: 51%|█████ | 357/700 [02:34<02:28, 2.31it/s, loss=0.0892, lr=0.0001]\nSteps: 51%|█████ | 357/700 [02:34<02:28, 2.31it/s, loss=0.181, lr=0.0001] \nSteps: 51%|█████ | 358/700 [02:35<02:28, 2.31it/s, loss=0.181, lr=0.0001]\nSteps: 51%|█████ | 358/700 [02:35<02:28, 2.31it/s, loss=0.0601, lr=0.0001]\nSteps: 51%|█████▏ | 359/700 [02:35<02:27, 2.31it/s, loss=0.0601, lr=0.0001]\nSteps: 51%|█████▏ | 359/700 [02:35<02:27, 2.31it/s, loss=0.124, lr=0.0001] \nSteps: 51%|█████▏ | 360/700 [02:36<02:27, 2.31it/s, loss=0.124, lr=0.0001]\nSteps: 51%|█████▏ | 360/700 [02:36<02:27, 2.31it/s, loss=0.0831, lr=0.0001]\nSteps: 52%|█████▏ | 361/700 [02:36<02:27, 2.30it/s, loss=0.0831, lr=0.0001]\nSteps: 52%|█████▏ | 361/700 [02:36<02:27, 2.30it/s, loss=0.0764, lr=0.0001]\nSteps: 52%|█████▏ | 362/700 [02:36<02:26, 2.30it/s, loss=0.0764, lr=0.0001]\nSteps: 52%|█████▏ | 362/700 [02:37<02:26, 2.30it/s, loss=0.189, lr=0.0001] \nSteps: 52%|█████▏ | 363/700 [02:37<02:26, 2.30it/s, loss=0.189, lr=0.0001]\nSteps: 52%|█████▏ | 363/700 [02:37<02:26, 2.30it/s, loss=0.0764, lr=0.0001]\nSteps: 52%|█████▏ | 364/700 [02:37<02:25, 2.31it/s, loss=0.0764, lr=0.0001]\nSteps: 52%|█████▏ | 364/700 [02:37<02:25, 2.31it/s, loss=0.131, lr=0.0001] \nSteps: 52%|█████▏ | 365/700 [02:38<02:25, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 52%|█████▏ | 365/700 [02:38<02:25, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 52%|█████▏ | 366/700 [02:38<02:24, 2.31it/s, loss=0.0874, lr=0.0001]\nSteps: 52%|█████▏ | 366/700 [02:38<02:24, 2.31it/s, loss=0.183, lr=0.0001] \nSteps: 52%|█████▏ | 367/700 [02:39<02:24, 2.31it/s, loss=0.183, lr=0.0001]\nSteps: 52%|█████▏ | 367/700 [02:39<02:24, 2.31it/s, 
loss=0.0662, lr=0.0001]\nSteps: 53%|█████▎ | 368/700 [02:39<02:23, 2.31it/s, loss=0.0662, lr=0.0001]\nSteps: 53%|█████▎ | 368/700 [02:39<02:23, 2.31it/s, loss=0.112, lr=0.0001] \nSteps: 53%|█████▎ | 369/700 [02:40<02:23, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 53%|█████▎ | 369/700 [02:40<02:23, 2.31it/s, loss=0.0877, lr=0.0001]\nSteps: 53%|█████▎ | 370/700 [02:40<02:22, 2.31it/s, loss=0.0877, lr=0.0001]\nSteps: 53%|█████▎ | 370/700 [02:40<02:22, 2.31it/s, loss=0.0994, lr=0.0001]\nSteps: 53%|█████▎ | 371/700 [02:40<02:22, 2.31it/s, loss=0.0994, lr=0.0001]\nSteps: 53%|█████▎ | 371/700 [02:40<02:22, 2.31it/s, loss=0.116, lr=0.0001] \nSteps: 53%|█████▎ | 372/700 [02:41<02:21, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 53%|█████▎ | 372/700 [02:41<02:21, 2.31it/s, loss=0.0953, lr=0.0001]\nSteps: 53%|█████▎ | 373/700 [02:41<02:22, 2.30it/s, loss=0.0953, lr=0.0001]\nSteps: 53%|█████▎ | 373/700 [02:41<02:22, 2.30it/s, loss=0.129, lr=0.0001] \nSteps: 53%|█████▎ | 374/700 [02:42<02:21, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 53%|█████▎ | 374/700 [02:42<02:21, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 54%|█████▎ | 375/700 [02:42<02:20, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 54%|█████▎ | 375/700 [02:42<02:20, 2.31it/s, loss=0.0471, lr=0.0001]\nSteps: 54%|█████▎ | 376/700 [02:43<02:20, 2.31it/s, loss=0.0471, lr=0.0001]\nSteps: 54%|█████▎ | 376/700 [02:43<02:20, 2.31it/s, loss=0.0695, lr=0.0001]\nSteps: 54%|█████▍ | 377/700 [02:43<02:19, 2.31it/s, loss=0.0695, lr=0.0001]\nSteps: 54%|█████▍ | 377/700 [02:43<02:19, 2.31it/s, loss=0.078, lr=0.0001] \nSteps: 54%|█████▍ | 378/700 [02:43<02:19, 2.31it/s, loss=0.078, lr=0.0001]\nSteps: 54%|█████▍ | 378/700 [02:43<02:19, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 54%|█████▍ | 379/700 [02:44<02:18, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 54%|█████▍ | 379/700 [02:44<02:18, 2.31it/s, loss=0.0692, lr=0.0001]\nSteps: 54%|█████▍ | 380/700 [02:44<02:18, 2.31it/s, loss=0.0692, lr=0.0001]\nSteps: 54%|█████▍ | 380/700 [02:44<02:18, 2.31it/s, 
loss=0.0489, lr=0.0001]\nSteps: 54%|█████▍ | 381/700 [02:45<02:17, 2.31it/s, loss=0.0489, lr=0.0001]\nSteps: 54%|█████▍ | 381/700 [02:45<02:17, 2.31it/s, loss=0.168, lr=0.0001] \nSteps: 55%|█████▍ | 382/700 [02:45<02:17, 2.31it/s, loss=0.168, lr=0.0001]\nSteps: 55%|█████▍ | 382/700 [02:45<02:17, 2.31it/s, loss=0.0656, lr=0.0001]\nSteps: 55%|█████▍ | 383/700 [02:46<02:27, 2.15it/s, loss=0.0656, lr=0.0001]\nSteps: 55%|█████▍ | 383/700 [02:46<02:27, 2.15it/s, loss=0.209, lr=0.0001] \nSteps: 55%|█████▍ | 384/700 [02:46<02:23, 2.20it/s, loss=0.209, lr=0.0001]\nSteps: 55%|█████▍ | 384/700 [02:46<02:23, 2.20it/s, loss=0.134, lr=0.0001]\nSteps: 55%|█████▌ | 385/700 [02:47<02:21, 2.22it/s, loss=0.134, lr=0.0001]\nSteps: 55%|█████▌ | 385/700 [02:47<02:21, 2.22it/s, loss=0.114, lr=0.0001]\nSteps: 55%|█████▌ | 386/700 [02:47<02:19, 2.25it/s, loss=0.114, lr=0.0001]\nSteps: 55%|█████▌ | 386/700 [02:47<02:19, 2.25it/s, loss=0.109, lr=0.0001]\nSteps: 55%|█████▌ | 387/700 [02:47<02:18, 2.26it/s, loss=0.109, lr=0.0001]\nSteps: 55%|█████▌ | 387/700 [02:47<02:18, 2.26it/s, loss=0.0913, lr=0.0001]\nSteps: 55%|█████▌ | 388/700 [02:48<02:16, 2.28it/s, loss=0.0913, lr=0.0001]\nSteps: 55%|█████▌ | 388/700 [02:48<02:16, 2.28it/s, loss=0.0507, lr=0.0001]\nSteps: 56%|█████▌ | 389/700 [02:48<02:16, 2.28it/s, loss=0.0507, lr=0.0001]\nSteps: 56%|█████▌ | 389/700 [02:48<02:16, 2.28it/s, loss=0.221, lr=0.0001] \nSteps: 56%|█████▌ | 390/700 [02:49<02:15, 2.29it/s, loss=0.221, lr=0.0001]\nSteps: 56%|█████▌ | 390/700 [02:49<02:15, 2.29it/s, loss=0.0575, lr=0.0001]\nSteps: 56%|█████▌ | 391/700 [02:49<02:14, 2.30it/s, loss=0.0575, lr=0.0001]\nSteps: 56%|█████▌ | 391/700 [02:49<02:14, 2.30it/s, loss=0.0787, lr=0.0001]\nSteps: 56%|█████▌ | 392/700 [02:50<02:13, 2.30it/s, loss=0.0787, lr=0.0001]\nSteps: 56%|█████▌ | 392/700 [02:50<02:13, 2.30it/s, loss=0.121, lr=0.0001] \nSteps: 56%|█████▌ | 393/700 [02:50<02:13, 2.30it/s, loss=0.121, lr=0.0001]\nSteps: 56%|█████▌ | 393/700 [02:50<02:13, 2.30it/s, 
loss=0.0559, lr=0.0001]\nSteps: 56%|█████▋ | 394/700 [02:50<02:12, 2.31it/s, loss=0.0559, lr=0.0001]\nSteps: 56%|█████▋ | 394/700 [02:50<02:12, 2.31it/s, loss=0.0453, lr=0.0001]\nSteps: 56%|█████▋ | 395/700 [02:51<02:12, 2.31it/s, loss=0.0453, lr=0.0001]\nSteps: 56%|█████▋ | 395/700 [02:51<02:12, 2.31it/s, loss=0.0741, lr=0.0001]\nSteps: 57%|█████▋ | 396/700 [02:51<02:11, 2.31it/s, loss=0.0741, lr=0.0001]\nSteps: 57%|█████▋ | 396/700 [02:51<02:11, 2.31it/s, loss=0.138, lr=0.0001] \nSteps: 57%|█████▋ | 397/700 [02:52<02:12, 2.30it/s, loss=0.138, lr=0.0001]\nSteps: 57%|█████▋ | 397/700 [02:52<02:12, 2.30it/s, loss=0.0937, lr=0.0001]\nSteps: 57%|█████▋ | 398/700 [02:52<02:11, 2.30it/s, loss=0.0937, lr=0.0001]\nSteps: 57%|█████▋ | 398/700 [02:52<02:11, 2.30it/s, loss=0.0666, lr=0.0001]\nSteps: 57%|█████▋ | 399/700 [02:53<02:10, 2.30it/s, loss=0.0666, lr=0.0001]\nSteps: 57%|█████▋ | 399/700 [02:53<02:10, 2.30it/s, loss=0.0977, lr=0.0001]\nSteps: 57%|█████▋ | 400/700 [02:53<02:10, 2.31it/s, loss=0.0977, lr=0.0001]\nSteps: 57%|█████▋ | 400/700 [02:53<02:10, 2.31it/s, loss=0.133, lr=0.0001] \nSteps: 57%|█████▋ | 401/700 [02:53<02:09, 2.31it/s, loss=0.133, lr=0.0001]\nSteps: 57%|█████▋ | 401/700 [02:54<02:09, 2.31it/s, loss=0.0634, lr=0.0001]\nSteps: 57%|█████▋ | 402/700 [02:54<02:09, 2.31it/s, loss=0.0634, lr=0.0001]\nSteps: 57%|█████▋ | 402/700 [02:54<02:09, 2.31it/s, loss=0.0826, lr=0.0001]\nSteps: 58%|█████▊ | 403/700 [02:54<02:08, 2.31it/s, loss=0.0826, lr=0.0001]\nSteps: 58%|█████▊ | 403/700 [02:54<02:08, 2.31it/s, loss=0.0451, lr=0.0001]\nSteps: 58%|█████▊ | 404/700 [02:55<02:08, 2.31it/s, loss=0.0451, lr=0.0001]\nSteps: 58%|█████▊ | 404/700 [02:55<02:08, 2.31it/s, loss=0.146, lr=0.0001] \nSteps: 58%|█████▊ | 405/700 [02:55<02:07, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 58%|█████▊ | 405/700 [02:55<02:07, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 58%|█████▊ | 406/700 [02:56<02:07, 2.31it/s, loss=0.127, lr=0.0001]\nSteps: 58%|█████▊ | 406/700 [02:56<02:07, 2.31it/s, 
loss=0.11, lr=0.0001] \nSteps: 58%|█████▊ | 407/700 [02:56<02:06, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 58%|█████▊ | 407/700 [02:56<02:06, 2.31it/s, loss=0.0996, lr=0.0001]\nSteps: 58%|█████▊ | 408/700 [02:57<02:06, 2.31it/s, loss=0.0996, lr=0.0001]\nSteps: 58%|█████▊ | 408/700 [02:57<02:06, 2.31it/s, loss=0.136, lr=0.0001] \nSteps: 58%|█████▊ | 409/700 [02:57<02:06, 2.30it/s, loss=0.136, lr=0.0001]\nSteps: 58%|█████▊ | 409/700 [02:57<02:06, 2.30it/s, loss=0.174, lr=0.0001]\nSteps: 59%|█████▊ | 410/700 [02:57<02:05, 2.30it/s, loss=0.174, lr=0.0001]\nSteps: 59%|█████▊ | 410/700 [02:57<02:05, 2.30it/s, loss=0.106, lr=0.0001]\nSteps: 59%|█████▊ | 411/700 [02:58<02:05, 2.30it/s, loss=0.106, lr=0.0001]\nSteps: 59%|█████▊ | 411/700 [02:58<02:05, 2.30it/s, loss=0.137, lr=0.0001]\nSteps: 59%|█████▉ | 412/700 [02:58<02:04, 2.31it/s, loss=0.137, lr=0.0001]\nSteps: 59%|█████▉ | 412/700 [02:58<02:04, 2.31it/s, loss=0.0351, lr=0.0001]\nSteps: 59%|█████▉ | 413/700 [02:59<02:04, 2.31it/s, loss=0.0351, lr=0.0001]\nSteps: 59%|█████▉ | 413/700 [02:59<02:04, 2.31it/s, loss=0.136, lr=0.0001] \nSteps: 59%|█████▉ | 414/700 [02:59<02:03, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 59%|█████▉ | 414/700 [02:59<02:03, 2.31it/s, loss=0.0681, lr=0.0001]\nSteps: 59%|█████▉ | 415/700 [03:00<02:03, 2.31it/s, loss=0.0681, lr=0.0001]\nSteps: 59%|█████▉ | 415/700 [03:00<02:03, 2.31it/s, loss=0.0218, lr=0.0001]\nSteps: 59%|█████▉ | 416/700 [03:00<02:02, 2.31it/s, loss=0.0218, lr=0.0001]\nSteps: 59%|█████▉ | 416/700 [03:00<02:02, 2.31it/s, loss=0.0585, lr=0.0001]\nSteps: 60%|█████▉ | 417/700 [03:00<02:02, 2.31it/s, loss=0.0585, lr=0.0001]\nSteps: 60%|█████▉ | 417/700 [03:00<02:02, 2.31it/s, loss=0.0662, lr=0.0001]\nSteps: 60%|█████▉ | 418/700 [03:01<02:02, 2.31it/s, loss=0.0662, lr=0.0001]\nSteps: 60%|█████▉ | 418/700 [03:01<02:02, 2.31it/s, loss=0.0406, lr=0.0001]\nSteps: 60%|█████▉ | 419/700 [03:01<02:01, 2.31it/s, loss=0.0406, lr=0.0001]\nSteps: 60%|█████▉ | 419/700 [03:01<02:01, 2.31it/s, 
loss=0.0997, lr=0.0001]\nSteps: 60%|██████ | 420/700 [03:02<02:01, 2.31it/s, loss=0.0997, lr=0.0001]\nSteps: 60%|██████ | 420/700 [03:02<02:01, 2.31it/s, loss=0.13, lr=0.0001] \nSteps: 60%|██████ | 421/700 [03:02<02:01, 2.30it/s, loss=0.13, lr=0.0001]\nSteps: 60%|██████ | 421/700 [03:02<02:01, 2.30it/s, loss=0.0885, lr=0.0001]\nSteps: 60%|██████ | 422/700 [03:03<02:00, 2.30it/s, loss=0.0885, lr=0.0001]\nSteps: 60%|██████ | 422/700 [03:03<02:00, 2.30it/s, loss=0.0947, lr=0.0001]\nSteps: 60%|██████ | 423/700 [03:03<02:00, 2.31it/s, loss=0.0947, lr=0.0001]\nSteps: 60%|██████ | 423/700 [03:03<02:00, 2.31it/s, loss=0.0685, lr=0.0001]\nSteps: 61%|██████ | 424/700 [03:03<01:59, 2.31it/s, loss=0.0685, lr=0.0001]\nSteps: 61%|██████ | 424/700 [03:03<01:59, 2.31it/s, loss=0.0901, lr=0.0001]\nSteps: 61%|██████ | 425/700 [03:04<01:59, 2.31it/s, loss=0.0901, lr=0.0001]\nSteps: 61%|██████ | 425/700 [03:04<01:59, 2.31it/s, loss=0.233, lr=0.0001] \nSteps: 61%|██████ | 426/700 [03:04<01:58, 2.31it/s, loss=0.233, lr=0.0001]\nSteps: 61%|██████ | 426/700 [03:04<01:58, 2.31it/s, loss=0.0955, lr=0.0001]\nSteps: 61%|██████ | 427/700 [03:05<01:58, 2.31it/s, loss=0.0955, lr=0.0001]\nSteps: 61%|██████ | 427/700 [03:05<01:58, 2.31it/s, loss=0.174, lr=0.0001] \nSteps: 61%|██████ | 428/700 [03:05<01:57, 2.31it/s, loss=0.174, lr=0.0001]\nSteps: 61%|██████ | 428/700 [03:05<01:57, 2.31it/s, loss=0.0519, lr=0.0001]\nSteps: 61%|██████▏ | 429/700 [03:06<01:57, 2.31it/s, loss=0.0519, lr=0.0001]\nSteps: 61%|██████▏ | 429/700 [03:06<01:57, 2.31it/s, loss=0.0831, lr=0.0001]\nSteps: 61%|██████▏ | 430/700 [03:06<01:56, 2.31it/s, loss=0.0831, lr=0.0001]\nSteps: 61%|██████▏ | 430/700 [03:06<01:56, 2.31it/s, loss=0.117, lr=0.0001] \nSteps: 62%|██████▏ | 431/700 [03:06<01:56, 2.31it/s, loss=0.117, lr=0.0001]\nSteps: 62%|██████▏ | 431/700 [03:07<01:56, 2.31it/s, loss=0.149, lr=0.0001]\nSteps: 62%|██████▏ | 432/700 [03:07<01:55, 2.31it/s, loss=0.149, lr=0.0001]\nSteps: 62%|██████▏ | 432/700 [03:07<01:55, 
2.31it/s, loss=0.0795, lr=0.0001]\nSteps: 62%|██████▏ | 433/700 [03:07<01:56, 2.30it/s, loss=0.0795, lr=0.0001]\nSteps: 62%|██████▏ | 433/700 [03:07<01:56, 2.30it/s, loss=0.107, lr=0.0001] \nSteps: 62%|██████▏ | 434/700 [03:08<01:55, 2.30it/s, loss=0.107, lr=0.0001]\nSteps: 62%|██████▏ | 434/700 [03:08<01:55, 2.30it/s, loss=0.0928, lr=0.0001]\nSteps: 62%|██████▏ | 435/700 [03:08<01:54, 2.31it/s, loss=0.0928, lr=0.0001]\nSteps: 62%|██████▏ | 435/700 [03:08<01:54, 2.31it/s, loss=0.0883, lr=0.0001]\nSteps: 62%|██████▏ | 436/700 [03:09<01:54, 2.31it/s, loss=0.0883, lr=0.0001]\nSteps: 62%|██████▏ | 436/700 [03:09<01:54, 2.31it/s, loss=0.0757, lr=0.0001]\nSteps: 62%|██████▏ | 437/700 [03:09<01:53, 2.31it/s, loss=0.0757, lr=0.0001]\nSteps: 62%|██████▏ | 437/700 [03:09<01:53, 2.31it/s, loss=0.119, lr=0.0001] \nSteps: 63%|██████▎ | 438/700 [03:10<01:53, 2.31it/s, loss=0.119, lr=0.0001]\nSteps: 63%|██████▎ | 438/700 [03:10<01:53, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 63%|██████▎ | 439/700 [03:10<01:53, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 63%|██████▎ | 439/700 [03:10<01:53, 2.31it/s, loss=0.0425, lr=0.0001]\nSteps: 63%|██████▎ | 440/700 [03:10<01:52, 2.31it/s, loss=0.0425, lr=0.0001]\nSteps: 63%|██████▎ | 440/700 [03:10<01:52, 2.31it/s, loss=0.143, lr=0.0001] \nSteps: 63%|██████▎ | 441/700 [03:11<01:52, 2.31it/s, loss=0.143, lr=0.0001]\nSteps: 63%|██████▎ | 441/700 [03:11<01:52, 2.31it/s, loss=0.0846, lr=0.0001]\nSteps: 63%|██████▎ | 442/700 [03:11<01:51, 2.31it/s, loss=0.0846, lr=0.0001]\nSteps: 63%|██████▎ | 442/700 [03:11<01:51, 2.31it/s, loss=0.0804, lr=0.0001]\nSteps: 63%|██████▎ | 443/700 [03:12<01:51, 2.31it/s, loss=0.0804, lr=0.0001]\nSteps: 63%|██████▎ | 443/700 [03:12<01:51, 2.31it/s, loss=0.139, lr=0.0001] \nSteps: 63%|██████▎ | 444/700 [03:12<01:50, 2.31it/s, loss=0.139, lr=0.0001]\nSteps: 63%|██████▎ | 444/700 [03:12<01:50, 2.31it/s, loss=0.115, lr=0.0001]\nSteps: 64%|██████▎ | 445/700 [03:13<01:50, 2.30it/s, loss=0.115, lr=0.0001]\nSteps: 64%|██████▎ | 
445/700 [03:13<01:50, 2.30it/s, loss=0.0897, lr=0.0001]\nSteps: 64%|██████▎ | 446/700 [03:13<01:50, 2.30it/s, loss=0.0897, lr=0.0001]\nSteps: 64%|██████▎ | 446/700 [03:13<01:50, 2.30it/s, loss=0.0656, lr=0.0001]\nSteps: 64%|██████▍ | 447/700 [03:13<01:49, 2.31it/s, loss=0.0656, lr=0.0001]\nSteps: 64%|██████▍ | 447/700 [03:13<01:49, 2.31it/s, loss=0.0926, lr=0.0001]\nSteps: 64%|██████▍ | 448/700 [03:14<01:49, 2.31it/s, loss=0.0926, lr=0.0001]\nSteps: 64%|██████▍ | 448/700 [03:14<01:49, 2.31it/s, loss=0.0764, lr=0.0001]\nSteps: 64%|██████▍ | 449/700 [03:14<01:48, 2.31it/s, loss=0.0764, lr=0.0001]\nSteps: 64%|██████▍ | 449/700 [03:14<01:48, 2.31it/s, loss=0.0648, lr=0.0001]\nSteps: 64%|██████▍ | 450/700 [03:15<01:48, 2.31it/s, loss=0.0648, lr=0.0001]\nSteps: 64%|██████▍ | 450/700 [03:15<01:48, 2.31it/s, loss=0.0487, lr=0.0001]\nSteps: 64%|██████▍ | 451/700 [03:15<01:47, 2.31it/s, loss=0.0487, lr=0.0001]\nSteps: 64%|██████▍ | 451/700 [03:15<01:47, 2.31it/s, loss=0.0588, lr=0.0001]\nSteps: 65%|██████▍ | 452/700 [03:16<01:47, 2.31it/s, loss=0.0588, lr=0.0001]\nSteps: 65%|██████▍ | 452/700 [03:16<01:47, 2.31it/s, loss=0.0702, lr=0.0001]\nSteps: 65%|██████▍ | 453/700 [03:16<01:46, 2.31it/s, loss=0.0702, lr=0.0001]\nSteps: 65%|██████▍ | 453/700 [03:16<01:46, 2.31it/s, loss=0.0665, lr=0.0001]\nSteps: 65%|██████▍ | 454/700 [03:16<01:46, 2.31it/s, loss=0.0665, lr=0.0001]\nSteps: 65%|██████▍ | 454/700 [03:16<01:46, 2.31it/s, loss=0.189, lr=0.0001] \nSteps: 65%|██████▌ | 455/700 [03:17<01:46, 2.31it/s, loss=0.189, lr=0.0001]\nSteps: 65%|██████▌ | 455/700 [03:17<01:46, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 65%|██████▌ | 456/700 [03:17<01:45, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 65%|██████▌ | 456/700 [03:17<01:45, 2.31it/s, loss=0.114, lr=0.0001]\nSteps: 65%|██████▌ | 457/700 [03:18<01:45, 2.30it/s, loss=0.114, lr=0.0001]\nSteps: 65%|██████▌ | 457/700 [03:18<01:45, 2.30it/s, loss=0.0849, lr=0.0001]\nSteps: 65%|██████▌ | 458/700 [03:18<01:45, 2.30it/s, loss=0.0849, 
lr=0.0001]\nSteps: 65%|██████▌ | 458/700 [03:18<01:45, 2.30it/s, loss=0.084, lr=0.0001] \nSteps: 66%|██████▌ | 459/700 [03:19<01:44, 2.31it/s, loss=0.084, lr=0.0001]\nSteps: 66%|██████▌ | 459/700 [03:19<01:44, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 66%|██████▌ | 460/700 [03:19<01:43, 2.31it/s, loss=0.165, lr=0.0001]\nSteps: 66%|██████▌ | 460/700 [03:19<01:43, 2.31it/s, loss=0.0867, lr=0.0001]\nSteps: 66%|██████▌ | 461/700 [03:19<01:43, 2.31it/s, loss=0.0867, lr=0.0001]\nSteps: 66%|██████▌ | 461/700 [03:20<01:43, 2.31it/s, loss=0.0846, lr=0.0001]\nSteps: 66%|██████▌ | 462/700 [03:20<01:43, 2.31it/s, loss=0.0846, lr=0.0001]\nSteps: 66%|██████▌ | 462/700 [03:20<01:43, 2.31it/s, loss=0.107, lr=0.0001] \nSteps: 66%|██████▌ | 463/700 [03:20<01:42, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 66%|██████▌ | 463/700 [03:20<01:42, 2.31it/s, loss=0.0725, lr=0.0001]\nSteps: 66%|██████▋ | 464/700 [03:21<01:42, 2.31it/s, loss=0.0725, lr=0.0001]\nSteps: 66%|██████▋ | 464/700 [03:21<01:42, 2.31it/s, loss=0.0726, lr=0.0001]\nSteps: 66%|██████▋ | 465/700 [03:21<01:41, 2.31it/s, loss=0.0726, lr=0.0001]\nSteps: 66%|██████▋ | 465/700 [03:21<01:41, 2.31it/s, loss=0.106, lr=0.0001] \nSteps: 67%|██████▋ | 466/700 [03:22<01:41, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 67%|██████▋ | 466/700 [03:22<01:41, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 67%|██████▋ | 467/700 [03:22<01:40, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 67%|██████▋ | 467/700 [03:22<01:40, 2.31it/s, loss=0.0495, lr=0.0001]\nSteps: 67%|██████▋ | 468/700 [03:23<01:40, 2.31it/s, loss=0.0495, lr=0.0001]\nSteps: 67%|██████▋ | 468/700 [03:23<01:40, 2.31it/s, loss=0.0961, lr=0.0001]\nSteps: 67%|██████▋ | 469/700 [03:23<01:40, 2.30it/s, loss=0.0961, lr=0.0001]\nSteps: 67%|██████▋ | 469/700 [03:23<01:40, 2.30it/s, loss=0.0487, lr=0.0001]\nSteps: 67%|██████▋ | 470/700 [03:23<01:39, 2.30it/s, loss=0.0487, lr=0.0001]\nSteps: 67%|██████▋ | 470/700 [03:23<01:39, 2.30it/s, loss=0.0406, lr=0.0001]\nSteps: 67%|██████▋ | 471/700 [03:24<01:39, 
2.31it/s, loss=0.0406, lr=0.0001]\nSteps: 67%|██████▋ | 471/700 [03:24<01:39, 2.31it/s, loss=0.044, lr=0.0001] \nSteps: 67%|██████▋ | 472/700 [03:24<01:38, 2.31it/s, loss=0.044, lr=0.0001]\nSteps: 67%|██████▋ | 472/700 [03:24<01:38, 2.31it/s, loss=0.149, lr=0.0001]\nSteps: 68%|██████▊ | 473/700 [03:25<01:38, 2.31it/s, loss=0.149, lr=0.0001]\nSteps: 68%|██████▊ | 473/700 [03:25<01:38, 2.31it/s, loss=0.0757, lr=0.0001]\nSteps: 68%|██████▊ | 474/700 [03:25<01:37, 2.31it/s, loss=0.0757, lr=0.0001]\nSteps: 68%|██████▊ | 474/700 [03:25<01:37, 2.31it/s, loss=0.0707, lr=0.0001]\nSteps: 68%|██████▊ | 475/700 [03:26<01:37, 2.31it/s, loss=0.0707, lr=0.0001]\nSteps: 68%|██████▊ | 475/700 [03:26<01:37, 2.31it/s, loss=0.0752, lr=0.0001]\nSteps: 68%|██████▊ | 476/700 [03:26<01:36, 2.31it/s, loss=0.0752, lr=0.0001]\nSteps: 68%|██████▊ | 476/700 [03:26<01:36, 2.31it/s, loss=0.0597, lr=0.0001]\nSteps: 68%|██████▊ | 477/700 [03:26<01:36, 2.31it/s, loss=0.0597, lr=0.0001]\nSteps: 68%|██████▊ | 477/700 [03:26<01:36, 2.31it/s, loss=0.078, lr=0.0001] \nSteps: 68%|██████▊ | 478/700 [03:27<01:36, 2.31it/s, loss=0.078, lr=0.0001]\nSteps: 68%|██████▊ | 478/700 [03:27<01:36, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 68%|██████▊ | 479/700 [03:27<01:35, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 68%|██████▊ | 479/700 [03:27<01:35, 2.31it/s, loss=0.0575, lr=0.0001]\nSteps: 69%|██████▊ | 480/700 [03:28<01:35, 2.31it/s, loss=0.0575, lr=0.0001]\nSteps: 69%|██████▊ | 480/700 [03:28<01:35, 2.31it/s, loss=0.099, lr=0.0001] \nSteps: 69%|██████▊ | 481/700 [03:28<01:35, 2.30it/s, loss=0.099, lr=0.0001]\nSteps: 69%|██████▊ | 481/700 [03:28<01:35, 2.30it/s, loss=0.119, lr=0.0001]\nSteps: 69%|██████▉ | 482/700 [03:29<01:34, 2.30it/s, loss=0.119, lr=0.0001]\nSteps: 69%|██████▉ | 482/700 [03:29<01:34, 2.30it/s, loss=0.246, lr=0.0001]\nSteps: 69%|██████▉ | 483/700 [03:29<01:34, 2.31it/s, loss=0.246, lr=0.0001]\nSteps: 69%|██████▉ | 483/700 [03:29<01:34, 2.31it/s, loss=0.0938, lr=0.0001]\nSteps: 69%|██████▉ | 
484/700 [03:29<01:33, 2.31it/s, loss=0.0938, lr=0.0001]\nSteps: 69%|██████▉ | 484/700 [03:29<01:33, 2.31it/s, loss=0.0895, lr=0.0001]\nSteps: 69%|██████▉ | 485/700 [03:30<01:33, 2.31it/s, loss=0.0895, lr=0.0001]\nSteps: 69%|██████▉ | 485/700 [03:30<01:33, 2.31it/s, loss=0.146, lr=0.0001] \nSteps: 69%|██████▉ | 486/700 [03:30<01:32, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 69%|██████▉ | 486/700 [03:30<01:32, 2.31it/s, loss=0.0565, lr=0.0001]\nSteps: 70%|██████▉ | 487/700 [03:31<01:32, 2.31it/s, loss=0.0565, lr=0.0001]\nSteps: 70%|██████▉ | 487/700 [03:31<01:32, 2.31it/s, loss=0.142, lr=0.0001] \nSteps: 70%|██████▉ | 488/700 [03:31<01:31, 2.31it/s, loss=0.142, lr=0.0001]\nSteps: 70%|██████▉ | 488/700 [03:31<01:31, 2.31it/s, loss=0.0218, lr=0.0001]\nSteps: 70%|██████▉ | 489/700 [03:32<01:31, 2.31it/s, loss=0.0218, lr=0.0001]\nSteps: 70%|██████▉ | 489/700 [03:32<01:31, 2.31it/s, loss=0.0811, lr=0.0001]\nSteps: 70%|███████ | 490/700 [03:32<01:30, 2.31it/s, loss=0.0811, lr=0.0001]\nSteps: 70%|███████ | 490/700 [03:32<01:30, 2.31it/s, loss=0.0571, lr=0.0001]\nSteps: 70%|███████ | 491/700 [03:32<01:30, 2.31it/s, loss=0.0571, lr=0.0001]\nSteps: 70%|███████ | 491/700 [03:33<01:30, 2.31it/s, loss=0.109, lr=0.0001] \nSteps: 70%|███████ | 492/700 [03:33<01:29, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 70%|███████ | 492/700 [03:33<01:29, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 70%|███████ | 493/700 [03:33<01:29, 2.30it/s, loss=0.136, lr=0.0001]\nSteps: 70%|███████ | 493/700 [03:33<01:29, 2.30it/s, loss=0.233, lr=0.0001]\nSteps: 71%|███████ | 494/700 [03:34<01:29, 2.30it/s, loss=0.233, lr=0.0001]\nSteps: 71%|███████ | 494/700 [03:34<01:29, 2.30it/s, loss=0.0985, lr=0.0001]\nSteps: 71%|███████ | 495/700 [03:34<01:28, 2.31it/s, loss=0.0985, lr=0.0001]\nSteps: 71%|███████ | 495/700 [03:34<01:28, 2.31it/s, loss=0.0914, lr=0.0001]\nSteps: 71%|███████ | 496/700 [03:35<01:28, 2.31it/s, loss=0.0914, lr=0.0001]\nSteps: 71%|███████ | 496/700 [03:35<01:28, 2.31it/s, loss=0.126, lr=0.0001] 
\nSteps: 71%|███████ | 497/700 [03:35<01:27, 2.31it/s, loss=0.126, lr=0.0001]\nSteps: 71%|███████ | 497/700 [03:35<01:27, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 71%|███████ | 498/700 [03:36<01:27, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 71%|███████ | 498/700 [03:36<01:27, 2.31it/s, loss=0.0553, lr=0.0001]\nSteps: 71%|███████▏ | 499/700 [03:36<01:27, 2.31it/s, loss=0.0553, lr=0.0001]\nSteps: 71%|███████▏ | 499/700 [03:36<01:27, 2.31it/s, loss=0.142, lr=0.0001] \nSteps: 71%|███████▏ | 500/700 [03:36<01:26, 2.31it/s, loss=0.142, lr=0.0001]\nSteps: 71%|███████▏ | 500/700 [03:36<01:26, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 72%|███████▏ | 501/700 [03:37<01:26, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 72%|███████▏ | 501/700 [03:37<01:26, 2.31it/s, loss=0.0563, lr=0.0001]\nSteps: 72%|███████▏ | 502/700 [03:37<01:25, 2.31it/s, loss=0.0563, lr=0.0001]\nSteps: 72%|███████▏ | 502/700 [03:37<01:25, 2.31it/s, loss=0.234, lr=0.0001] \nSteps: 72%|███████▏ | 503/700 [03:38<01:25, 2.31it/s, loss=0.234, lr=0.0001]\nSteps: 72%|███████▏ | 503/700 [03:38<01:25, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 72%|███████▏ | 504/700 [03:38<01:24, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 72%|███████▏ | 504/700 [03:38<01:24, 2.31it/s, loss=0.0646, lr=0.0001]\nSteps: 72%|███████▏ | 505/700 [03:39<01:24, 2.30it/s, loss=0.0646, lr=0.0001]\nSteps: 72%|███████▏ | 505/700 [03:39<01:24, 2.30it/s, loss=0.101, lr=0.0001] \nSteps: 72%|███████▏ | 506/700 [03:39<01:24, 2.30it/s, loss=0.101, lr=0.0001]\nSteps: 72%|███████▏ | 506/700 [03:39<01:24, 2.30it/s, loss=0.0391, lr=0.0001]\nSteps: 72%|███████▏ | 507/700 [03:39<01:23, 2.31it/s, loss=0.0391, lr=0.0001]\nSteps: 72%|███████▏ | 507/700 [03:39<01:23, 2.31it/s, loss=0.0464, lr=0.0001]\nSteps: 73%|███████▎ | 508/700 [03:40<01:23, 2.31it/s, loss=0.0464, lr=0.0001]\nSteps: 73%|███████▎ | 508/700 [03:40<01:23, 2.31it/s, loss=0.0821, lr=0.0001]\nSteps: 73%|███████▎ | 509/700 [03:40<01:22, 2.31it/s, loss=0.0821, lr=0.0001]\nSteps: 73%|███████▎ | 509/700 
[03:40<01:22, 2.31it/s, loss=0.191, lr=0.0001] \nSteps: 73%|███████▎ | 510/700 [03:41<01:22, 2.31it/s, loss=0.191, lr=0.0001]\nSteps: 73%|███████▎ | 510/700 [03:41<01:22, 2.31it/s, loss=0.0574, lr=0.0001]\nSteps: 73%|███████▎ | 511/700 [03:41<01:21, 2.31it/s, loss=0.0574, lr=0.0001]\nSteps: 73%|███████▎ | 511/700 [03:41<01:21, 2.31it/s, loss=0.0778, lr=0.0001]\nSteps: 73%|███████▎ | 512/700 [03:42<01:21, 2.31it/s, loss=0.0778, lr=0.0001]\nSteps: 73%|███████▎ | 512/700 [03:42<01:21, 2.31it/s, loss=0.179, lr=0.0001] \nSteps: 73%|███████▎ | 513/700 [03:42<01:20, 2.31it/s, loss=0.179, lr=0.0001]\nSteps: 73%|███████▎ | 513/700 [03:42<01:20, 2.31it/s, loss=0.0893, lr=0.0001]\nSteps: 73%|███████▎ | 514/700 [03:42<01:20, 2.31it/s, loss=0.0893, lr=0.0001]\nSteps: 73%|███████▎ | 514/700 [03:42<01:20, 2.31it/s, loss=0.0585, lr=0.0001]\nSteps: 74%|███████▎ | 515/700 [03:43<01:20, 2.31it/s, loss=0.0585, lr=0.0001]\nSteps: 74%|███████▎ | 515/700 [03:43<01:20, 2.31it/s, loss=0.0622, lr=0.0001]\nSteps: 74%|███████▎ | 516/700 [03:43<01:19, 2.31it/s, loss=0.0622, lr=0.0001]\nSteps: 74%|███████▎ | 516/700 [03:43<01:19, 2.31it/s, loss=0.0993, lr=0.0001]\nSteps: 74%|███████▍ | 517/700 [03:44<01:19, 2.30it/s, loss=0.0993, lr=0.0001]\nSteps: 74%|███████▍ | 517/700 [03:44<01:19, 2.30it/s, loss=0.0807, lr=0.0001]\nSteps: 74%|███████▍ | 518/700 [03:44<01:18, 2.31it/s, loss=0.0807, lr=0.0001]\nSteps: 74%|███████▍ | 518/700 [03:44<01:18, 2.31it/s, loss=0.1, lr=0.0001] \nSteps: 74%|███████▍ | 519/700 [03:45<01:18, 2.31it/s, loss=0.1, lr=0.0001]\nSteps: 74%|███████▍ | 519/700 [03:45<01:18, 2.31it/s, loss=0.0567, lr=0.0001]\nSteps: 74%|███████▍ | 520/700 [03:45<01:18, 2.31it/s, loss=0.0567, lr=0.0001]\nSteps: 74%|███████▍ | 520/700 [03:45<01:18, 2.31it/s, loss=0.163, lr=0.0001] \nSteps: 74%|███████▍ | 521/700 [03:45<01:17, 2.31it/s, loss=0.163, lr=0.0001]\nSteps: 74%|███████▍ | 521/700 [03:45<01:17, 2.31it/s, loss=0.146, lr=0.0001]\nSteps: 75%|███████▍ | 522/700 [03:46<01:17, 2.31it/s, 
loss=0.146, lr=0.0001]\nSteps: 75%|███████▍ | 522/700 [03:46<01:17, 2.31it/s, loss=0.12, lr=0.0001] \nSteps: 75%|███████▍ | 523/700 [03:46<01:16, 2.31it/s, loss=0.12, lr=0.0001]\nSteps: 75%|███████▍ | 523/700 [03:46<01:16, 2.31it/s, loss=0.181, lr=0.0001]\nSteps: 75%|███████▍ | 524/700 [03:47<01:16, 2.31it/s, loss=0.181, lr=0.0001]\nSteps: 75%|███████▍ | 524/700 [03:47<01:16, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 75%|███████▌ | 525/700 [03:47<01:15, 2.31it/s, loss=0.147, lr=0.0001]\nSteps: 75%|███████▌ | 525/700 [03:47<01:15, 2.31it/s, loss=0.0812, lr=0.0001]\nSteps: 75%|███████▌ | 526/700 [03:48<01:15, 2.31it/s, loss=0.0812, lr=0.0001]\nSteps: 75%|███████▌ | 526/700 [03:48<01:15, 2.31it/s, loss=0.0597, lr=0.0001]\nSteps: 75%|███████▌ | 527/700 [03:48<01:14, 2.31it/s, loss=0.0597, lr=0.0001]\nSteps: 75%|███████▌ | 527/700 [03:48<01:14, 2.31it/s, loss=0.0656, lr=0.0001]\nSteps: 75%|███████▌ | 528/700 [03:48<01:14, 2.31it/s, loss=0.0656, lr=0.0001]\nSteps: 75%|███████▌ | 528/700 [03:49<01:14, 2.31it/s, loss=0.114, lr=0.0001] \nSteps: 76%|███████▌ | 529/700 [03:49<01:14, 2.30it/s, loss=0.114, lr=0.0001]\nSteps: 76%|███████▌ | 529/700 [03:49<01:14, 2.30it/s, loss=0.0865, lr=0.0001]\nSteps: 76%|███████▌ | 530/700 [03:49<01:13, 2.30it/s, loss=0.0865, lr=0.0001]\nSteps: 76%|███████▌ | 530/700 [03:49<01:13, 2.30it/s, loss=0.0999, lr=0.0001]\nSteps: 76%|███████▌ | 531/700 [03:50<01:13, 2.31it/s, loss=0.0999, lr=0.0001]\nSteps: 76%|███████▌ | 531/700 [03:50<01:13, 2.31it/s, loss=0.142, lr=0.0001] \nSteps: 76%|███████▌ | 532/700 [03:50<01:12, 2.31it/s, loss=0.142, lr=0.0001]\nSteps: 76%|███████▌ | 532/700 [03:50<01:12, 2.31it/s, loss=0.0418, lr=0.0001]\nSteps: 76%|███████▌ | 533/700 [03:51<01:12, 2.31it/s, loss=0.0418, lr=0.0001]\nSteps: 76%|███████▌ | 533/700 [03:51<01:12, 2.31it/s, loss=0.0675, lr=0.0001]\nSteps: 76%|███████▋ | 534/700 [03:51<01:11, 2.31it/s, loss=0.0675, lr=0.0001]\nSteps: 76%|███████▋ | 534/700 [03:51<01:11, 2.31it/s, loss=0.051, lr=0.0001] \nSteps: 
76%|███████▋ | 535/700 [03:52<01:11, 2.31it/s, loss=0.051, lr=0.0001]\nSteps: 76%|███████▋ | 535/700 [03:52<01:11, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 77%|███████▋ | 536/700 [03:52<01:11, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 77%|███████▋ | 536/700 [03:52<01:11, 2.31it/s, loss=0.0786, lr=0.0001]\nSteps: 77%|███████▋ | 537/700 [03:52<01:10, 2.31it/s, loss=0.0786, lr=0.0001]\nSteps: 77%|███████▋ | 537/700 [03:52<01:10, 2.31it/s, loss=0.122, lr=0.0001] \nSteps: 77%|███████▋ | 538/700 [03:53<01:10, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 77%|███████▋ | 538/700 [03:53<01:10, 2.31it/s, loss=0.0734, lr=0.0001]\nSteps: 77%|███████▋ | 539/700 [03:53<01:09, 2.31it/s, loss=0.0734, lr=0.0001]\nSteps: 77%|███████▋ | 539/700 [03:53<01:09, 2.31it/s, loss=0.0796, lr=0.0001]\nSteps: 77%|███████▋ | 540/700 [03:54<01:09, 2.31it/s, loss=0.0796, lr=0.0001]\nSteps: 77%|███████▋ | 540/700 [03:54<01:09, 2.31it/s, loss=0.0497, lr=0.0001]\nSteps: 77%|███████▋ | 541/700 [03:54<01:09, 2.30it/s, loss=0.0497, lr=0.0001]\nSteps: 77%|███████▋ | 541/700 [03:54<01:09, 2.30it/s, loss=0.14, lr=0.0001] \nSteps: 77%|███████▋ | 542/700 [03:55<01:08, 2.30it/s, loss=0.14, lr=0.0001]\nSteps: 77%|███████▋ | 542/700 [03:55<01:08, 2.30it/s, loss=0.059, lr=0.0001]\nSteps: 78%|███████▊ | 543/700 [03:55<01:08, 2.30it/s, loss=0.059, lr=0.0001]\nSteps: 78%|███████▊ | 543/700 [03:55<01:08, 2.30it/s, loss=0.0413, lr=0.0001]\nSteps: 78%|███████▊ | 544/700 [03:55<01:07, 2.30it/s, loss=0.0413, lr=0.0001]\nSteps: 78%|███████▊ | 544/700 [03:55<01:07, 2.30it/s, loss=0.0563, lr=0.0001]\nSteps: 78%|███████▊ | 545/700 [03:56<01:07, 2.31it/s, loss=0.0563, lr=0.0001]\nSteps: 78%|███████▊ | 545/700 [03:56<01:07, 2.31it/s, loss=0.0928, lr=0.0001]\nSteps: 78%|███████▊ | 546/700 [03:56<01:06, 2.31it/s, loss=0.0928, lr=0.0001]\nSteps: 78%|███████▊ | 546/700 [03:56<01:06, 2.31it/s, loss=0.121, lr=0.0001] \nSteps: 78%|███████▊ | 547/700 [03:57<01:06, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 78%|███████▊ | 547/700 
[03:57<01:06, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 78%|███████▊ | 548/700 [03:57<01:05, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 78%|███████▊ | 548/700 [03:57<01:05, 2.31it/s, loss=0.11, lr=0.0001] \nSteps: 78%|███████▊ | 549/700 [03:58<01:05, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 78%|███████▊ | 549/700 [03:58<01:05, 2.31it/s, loss=0.0758, lr=0.0001]\nSteps: 79%|███████▊ | 550/700 [03:58<01:04, 2.31it/s, loss=0.0758, lr=0.0001]\nSteps: 79%|███████▊ | 550/700 [03:58<01:04, 2.31it/s, loss=0.0922, lr=0.0001]\nSteps: 79%|███████▊ | 551/700 [03:58<01:04, 2.31it/s, loss=0.0922, lr=0.0001]\nSteps: 79%|███████▊ | 551/700 [03:58<01:04, 2.31it/s, loss=0.0692, lr=0.0001]\nSteps: 79%|███████▉ | 552/700 [03:59<01:04, 2.31it/s, loss=0.0692, lr=0.0001]\nSteps: 79%|███████▉ | 552/700 [03:59<01:04, 2.31it/s, loss=0.0917, lr=0.0001]\nSteps: 79%|███████▉ | 553/700 [03:59<01:03, 2.30it/s, loss=0.0917, lr=0.0001]\nSteps: 79%|███████▉ | 553/700 [03:59<01:03, 2.30it/s, loss=0.0807, lr=0.0001]\nSteps: 79%|███████▉ | 554/700 [04:00<01:03, 2.30it/s, loss=0.0807, lr=0.0001]\nSteps: 79%|███████▉ | 554/700 [04:00<01:03, 2.30it/s, loss=0.0807, lr=0.0001]\nSteps: 79%|███████▉ | 555/700 [04:00<01:02, 2.31it/s, loss=0.0807, lr=0.0001]\nSteps: 79%|███████▉ | 555/700 [04:00<01:02, 2.31it/s, loss=0.121, lr=0.0001] \nSteps: 79%|███████▉ | 556/700 [04:01<01:02, 2.31it/s, loss=0.121, lr=0.0001]\nSteps: 79%|███████▉ | 556/700 [04:01<01:02, 2.31it/s, loss=0.0876, lr=0.0001]\nSteps: 80%|███████▉ | 557/700 [04:01<01:01, 2.31it/s, loss=0.0876, lr=0.0001]\nSteps: 80%|███████▉ | 557/700 [04:01<01:01, 2.31it/s, loss=0.114, lr=0.0001] \nSteps: 80%|███████▉ | 558/700 [04:01<01:01, 2.31it/s, loss=0.114, lr=0.0001]\nSteps: 80%|███████▉ | 558/700 [04:02<01:01, 2.31it/s, loss=0.0979, lr=0.0001]\nSteps: 80%|███████▉ | 559/700 [04:02<01:01, 2.31it/s, loss=0.0979, lr=0.0001]\nSteps: 80%|███████▉ | 559/700 [04:02<01:01, 2.31it/s, loss=0.0651, lr=0.0001]\nSteps: 80%|████████ | 560/700 [04:02<01:00, 2.31it/s, 
loss=0.0651, lr=0.0001]\nSteps: 80%|████████ | 560/700 [04:02<01:00, 2.31it/s, loss=0.064, lr=0.0001] \nSteps: 80%|████████ | 561/700 [04:03<01:00, 2.31it/s, loss=0.064, lr=0.0001]\nSteps: 80%|████████ | 561/700 [04:03<01:00, 2.31it/s, loss=0.137, lr=0.0001]\nSteps: 80%|████████ | 562/700 [04:03<00:59, 2.31it/s, loss=0.137, lr=0.0001]\nSteps: 80%|████████ | 562/700 [04:03<00:59, 2.31it/s, loss=0.0395, lr=0.0001]\nSteps: 80%|████████ | 563/700 [04:04<00:59, 2.31it/s, loss=0.0395, lr=0.0001]\nSteps: 80%|████████ | 563/700 [04:04<00:59, 2.31it/s, loss=0.107, lr=0.0001] \nSteps: 81%|████████ | 564/700 [04:04<00:58, 2.31it/s, loss=0.107, lr=0.0001]\nSteps: 81%|████████ | 564/700 [04:04<00:58, 2.31it/s, loss=0.0575, lr=0.0001]\nSteps: 81%|████████ | 565/700 [04:05<00:58, 2.29it/s, loss=0.0575, lr=0.0001]\nSteps: 81%|████████ | 565/700 [04:05<00:58, 2.29it/s, loss=0.0622, lr=0.0001]\nSteps: 81%|████████ | 566/700 [04:05<00:58, 2.30it/s, loss=0.0622, lr=0.0001]\nSteps: 81%|████████ | 566/700 [04:05<00:58, 2.30it/s, loss=0.0854, lr=0.0001]\nSteps: 81%|████████ | 567/700 [04:05<00:57, 2.31it/s, loss=0.0854, lr=0.0001]\nSteps: 81%|████████ | 567/700 [04:05<00:57, 2.31it/s, loss=0.0195, lr=0.0001]\nSteps: 81%|████████ | 568/700 [04:06<00:57, 2.31it/s, loss=0.0195, lr=0.0001]\nSteps: 81%|████████ | 568/700 [04:06<00:57, 2.31it/s, loss=0.105, lr=0.0001] \nSteps: 81%|████████▏ | 569/700 [04:06<00:56, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 81%|████████▏ | 569/700 [04:06<00:56, 2.31it/s, loss=0.11, lr=0.0001] \nSteps: 81%|████████▏ | 570/700 [04:07<00:56, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 81%|████████▏ | 570/700 [04:07<00:56, 2.31it/s, loss=0.0211, lr=0.0001]\nSteps: 82%|████████▏ | 571/700 [04:07<00:55, 2.31it/s, loss=0.0211, lr=0.0001]\nSteps: 82%|████████▏ | 571/700 [04:07<00:55, 2.31it/s, loss=0.0886, lr=0.0001]\nSteps: 82%|████████▏ | 572/700 [04:08<00:55, 2.31it/s, loss=0.0886, lr=0.0001]\nSteps: 82%|████████▏ | 572/700 [04:08<00:55, 2.31it/s, loss=0.103, lr=0.0001] 
\nSteps: 82%|████████▏ | 573/700 [04:08<00:54, 2.31it/s, loss=0.103, lr=0.0001]\nSteps: 82%|████████▏ | 573/700 [04:08<00:54, 2.31it/s, loss=0.0681, lr=0.0001]\nSteps: 82%|████████▏ | 574/700 [04:08<00:54, 2.31it/s, loss=0.0681, lr=0.0001]\nSteps: 82%|████████▏ | 574/700 [04:08<00:54, 2.31it/s, loss=0.0704, lr=0.0001]\nSteps: 82%|████████▏ | 575/700 [04:09<00:54, 2.31it/s, loss=0.0704, lr=0.0001]\nSteps: 82%|████████▏ | 575/700 [04:09<00:54, 2.31it/s, loss=0.044, lr=0.0001] \nSteps: 82%|████████▏ | 576/700 [04:09<00:53, 2.31it/s, loss=0.044, lr=0.0001]\nSteps: 82%|████████▏ | 576/700 [04:09<00:53, 2.31it/s, loss=0.0852, lr=0.0001]\nSteps: 82%|████████▏ | 577/700 [04:10<00:53, 2.30it/s, loss=0.0852, lr=0.0001]\nSteps: 82%|████████▏ | 577/700 [04:10<00:53, 2.30it/s, loss=0.176, lr=0.0001] \nSteps: 83%|████████▎ | 578/700 [04:10<00:52, 2.30it/s, loss=0.176, lr=0.0001]\nSteps: 83%|████████▎ | 578/700 [04:10<00:52, 2.30it/s, loss=0.0449, lr=0.0001]\nSteps: 83%|████████▎ | 579/700 [04:11<00:52, 2.30it/s, loss=0.0449, lr=0.0001]\nSteps: 83%|████████▎ | 579/700 [04:11<00:52, 2.30it/s, loss=0.0719, lr=0.0001]\nSteps: 83%|████████▎ | 580/700 [04:11<00:52, 2.31it/s, loss=0.0719, lr=0.0001]\nSteps: 83%|████████▎ | 580/700 [04:11<00:52, 2.31it/s, loss=0.0621, lr=0.0001]\nSteps: 83%|████████▎ | 581/700 [04:11<00:51, 2.31it/s, loss=0.0621, lr=0.0001]\nSteps: 83%|████████▎ | 581/700 [04:11<00:51, 2.31it/s, loss=0.106, lr=0.0001] \nSteps: 83%|████████▎ | 582/700 [04:12<00:51, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 83%|████████▎ | 582/700 [04:12<00:51, 2.31it/s, loss=0.057, lr=0.0001]\nSteps: 83%|████████▎ | 583/700 [04:12<00:50, 2.31it/s, loss=0.057, lr=0.0001]\nSteps: 83%|████████▎ | 583/700 [04:12<00:50, 2.31it/s, loss=0.0693, lr=0.0001]\nSteps: 83%|████████▎ | 584/700 [04:13<00:50, 2.31it/s, loss=0.0693, lr=0.0001]\nSteps: 83%|████████▎ | 584/700 [04:13<00:50, 2.31it/s, loss=0.0972, lr=0.0001]\nSteps: 84%|████████▎ | 585/700 [04:13<00:49, 2.31it/s, loss=0.0972, 
lr=0.0001]\nSteps: 84%|████████▎ | 585/700 [04:13<00:49, 2.31it/s, loss=0.0737, lr=0.0001]\nSteps: 84%|████████▎ | 586/700 [04:14<00:49, 2.31it/s, loss=0.0737, lr=0.0001]\nSteps: 84%|████████▎ | 586/700 [04:14<00:49, 2.31it/s, loss=0.099, lr=0.0001] \nSteps: 84%|████████▍ | 587/700 [04:14<00:48, 2.31it/s, loss=0.099, lr=0.0001]\nSteps: 84%|████████▍ | 587/700 [04:14<00:48, 2.31it/s, loss=0.0862, lr=0.0001]\nSteps: 84%|████████▍ | 588/700 [04:14<00:48, 2.31it/s, loss=0.0862, lr=0.0001]\nSteps: 84%|████████▍ | 588/700 [04:15<00:48, 2.31it/s, loss=0.0784, lr=0.0001]\nSteps: 84%|████████▍ | 589/700 [04:15<00:48, 2.30it/s, loss=0.0784, lr=0.0001]\nSteps: 84%|████████▍ | 589/700 [04:15<00:48, 2.30it/s, loss=0.0483, lr=0.0001]\nSteps: 84%|████████▍ | 590/700 [04:15<00:47, 2.30it/s, loss=0.0483, lr=0.0001]\nSteps: 84%|████████▍ | 590/700 [04:15<00:47, 2.30it/s, loss=0.051, lr=0.0001] \nSteps: 84%|████████▍ | 591/700 [04:16<00:47, 2.31it/s, loss=0.051, lr=0.0001]\nSteps: 84%|████████▍ | 591/700 [04:16<00:47, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 85%|████████▍ | 592/700 [04:16<00:46, 2.31it/s, loss=0.171, lr=0.0001]\nSteps: 85%|████████▍ | 592/700 [04:16<00:46, 2.31it/s, loss=0.0579, lr=0.0001]\nSteps: 85%|████████▍ | 593/700 [04:17<00:46, 2.31it/s, loss=0.0579, lr=0.0001]\nSteps: 85%|████████▍ | 593/700 [04:17<00:46, 2.31it/s, loss=0.174, lr=0.0001] \nSteps: 85%|████████▍ | 594/700 [04:17<00:45, 2.31it/s, loss=0.174, lr=0.0001]\nSteps: 85%|████████▍ | 594/700 [04:17<00:45, 2.31it/s, loss=0.0611, lr=0.0001]\nSteps: 85%|████████▌ | 595/700 [04:18<00:45, 2.31it/s, loss=0.0611, lr=0.0001]\nSteps: 85%|████████▌ | 595/700 [04:18<00:45, 2.31it/s, loss=0.0749, lr=0.0001]\nSteps: 85%|████████▌ | 596/700 [04:18<00:45, 2.31it/s, loss=0.0749, lr=0.0001]\nSteps: 85%|████████▌ | 596/700 [04:18<00:45, 2.31it/s, loss=0.108, lr=0.0001] \nSteps: 85%|████████▌ | 597/700 [04:18<00:44, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 85%|████████▌ | 597/700 [04:18<00:44, 2.31it/s, loss=0.0268, 
lr=0.0001]\nSteps: 85%|████████▌ | 598/700 [04:19<00:44, 2.31it/s, loss=0.0268, lr=0.0001]\nSteps: 85%|████████▌ | 598/700 [04:19<00:44, 2.31it/s, loss=0.11, lr=0.0001] \nSteps: 86%|████████▌ | 599/700 [04:19<00:43, 2.31it/s, loss=0.11, lr=0.0001]\nSteps: 86%|████████▌ | 599/700 [04:19<00:43, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 86%|████████▌ | 600/700 [04:20<00:43, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 86%|████████▌ | 600/700 [04:20<00:43, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 86%|████████▌ | 601/700 [04:20<00:43, 2.30it/s, loss=0.129, lr=0.0001]\nSteps: 86%|████████▌ | 601/700 [04:20<00:43, 2.30it/s, loss=0.0724, lr=0.0001]\nSteps: 86%|████████▌ | 602/700 [04:21<00:42, 2.30it/s, loss=0.0724, lr=0.0001]\nSteps: 86%|████████▌ | 602/700 [04:21<00:42, 2.30it/s, loss=0.0995, lr=0.0001]\nSteps: 86%|████████▌ | 603/700 [04:21<00:42, 2.30it/s, loss=0.0995, lr=0.0001]\nSteps: 86%|████████▌ | 603/700 [04:21<00:42, 2.30it/s, loss=0.138, lr=0.0001] \nSteps: 86%|████████▋ | 604/700 [04:21<00:41, 2.31it/s, loss=0.138, lr=0.0001]\nSteps: 86%|████████▋ | 604/700 [04:21<00:41, 2.31it/s, loss=0.173, lr=0.0001]\nSteps: 86%|████████▋ | 605/700 [04:22<00:41, 2.31it/s, loss=0.173, lr=0.0001]\nSteps: 86%|████████▋ | 605/700 [04:22<00:41, 2.31it/s, loss=0.0835, lr=0.0001]\nSteps: 87%|████████▋ | 606/700 [04:22<00:40, 2.31it/s, loss=0.0835, lr=0.0001]\nSteps: 87%|████████▋ | 606/700 [04:22<00:40, 2.31it/s, loss=0.0355, lr=0.0001]\nSteps: 87%|████████▋ | 607/700 [04:23<00:40, 2.31it/s, loss=0.0355, lr=0.0001]\nSteps: 87%|████████▋ | 607/700 [04:23<00:40, 2.31it/s, loss=0.08, lr=0.0001] \nSteps: 87%|████████▋ | 608/700 [04:23<00:39, 2.31it/s, loss=0.08, lr=0.0001]\nSteps: 87%|████████▋ | 608/700 [04:23<00:39, 2.31it/s, loss=0.0755, lr=0.0001]\nSteps: 87%|████████▋ | 609/700 [04:24<00:39, 2.31it/s, loss=0.0755, lr=0.0001]\nSteps: 87%|████████▋ | 609/700 [04:24<00:39, 2.31it/s, loss=0.0997, lr=0.0001]\nSteps: 87%|████████▋ | 610/700 [04:24<00:39, 2.31it/s, loss=0.0997, 
lr=0.0001]\nSteps: 87%|████████▋ | 610/700 [04:24<00:39, 2.31it/s, loss=0.0443, lr=0.0001]\nSteps: 87%|████████▋ | 611/700 [04:24<00:38, 2.31it/s, loss=0.0443, lr=0.0001]\nSteps: 87%|████████▋ | 611/700 [04:24<00:38, 2.31it/s, loss=0.0704, lr=0.0001]\nSteps: 87%|████████▋ | 612/700 [04:25<00:38, 2.31it/s, loss=0.0704, lr=0.0001]\nSteps: 87%|████████▋ | 612/700 [04:25<00:38, 2.31it/s, loss=0.175, lr=0.0001] \nSteps: 88%|████████▊ | 613/700 [04:25<00:37, 2.30it/s, loss=0.175, lr=0.0001]\nSteps: 88%|████████▊ | 613/700 [04:25<00:37, 2.30it/s, loss=0.0591, lr=0.0001]\nSteps: 88%|████████▊ | 614/700 [04:26<00:37, 2.30it/s, loss=0.0591, lr=0.0001]\nSteps: 88%|████████▊ | 614/700 [04:26<00:37, 2.30it/s, loss=0.0502, lr=0.0001]\nSteps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.0502, lr=0.0001]\nSteps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.0879, lr=0.0001]\nSteps: 88%|████████▊ | 616/700 [04:27<00:36, 2.31it/s, loss=0.0879, lr=0.0001]\nSteps: 88%|████████▊ | 616/700 [04:27<00:36, 2.31it/s, loss=0.134, lr=0.0001] \nSteps: 88%|████████▊ | 617/700 [04:27<00:35, 2.31it/s, loss=0.134, lr=0.0001]\nSteps: 88%|████████▊ | 617/700 [04:27<00:35, 2.31it/s, loss=0.0696, lr=0.0001]\nSteps: 88%|████████▊ | 618/700 [04:27<00:35, 2.31it/s, loss=0.0696, lr=0.0001]\nSteps: 88%|████████▊ | 618/700 [04:28<00:35, 2.31it/s, loss=0.0538, lr=0.0001]\nSteps: 88%|████████▊ | 619/700 [04:28<00:35, 2.31it/s, loss=0.0538, lr=0.0001]\nSteps: 88%|████████▊ | 619/700 [04:28<00:35, 2.31it/s, loss=0.112, lr=0.0001] \nSteps: 89%|████████▊ | 620/700 [04:28<00:34, 2.31it/s, loss=0.112, lr=0.0001]\nSteps: 89%|████████▊ | 620/700 [04:28<00:34, 2.31it/s, loss=0.0917, lr=0.0001]\nSteps: 89%|████████▊ | 621/700 [04:29<00:34, 2.31it/s, loss=0.0917, lr=0.0001]\nSteps: 89%|████████▊ | 621/700 [04:29<00:34, 2.31it/s, loss=0.114, lr=0.0001] \nSteps: 89%|████████▉ | 622/700 [04:29<00:33, 2.31it/s, loss=0.114, lr=0.0001]\nSteps: 89%|████████▉ | 622/700 [04:29<00:33, 2.31it/s, loss=0.0821, 
lr=0.0001]\nSteps: 89%|████████▉ | 623/700 [04:30<00:33, 2.31it/s, loss=0.0821, lr=0.0001]\nSteps: 89%|████████▉ | 623/700 [04:30<00:33, 2.31it/s, loss=0.0445, lr=0.0001]\nSteps: 89%|████████▉ | 624/700 [04:30<00:32, 2.32it/s, loss=0.0445, lr=0.0001]\nSteps: 89%|████████▉ | 624/700 [04:30<00:32, 2.32it/s, loss=0.0757, lr=0.0001]\nSteps: 89%|████████▉ | 625/700 [04:31<00:32, 2.30it/s, loss=0.0757, lr=0.0001]\nSteps: 89%|████████▉ | 625/700 [04:31<00:32, 2.30it/s, loss=0.0623, lr=0.0001]\nSteps: 89%|████████▉ | 626/700 [04:31<00:32, 2.31it/s, loss=0.0623, lr=0.0001]\nSteps: 89%|████████▉ | 626/700 [04:31<00:32, 2.31it/s, loss=0.129, lr=0.0001] \nSteps: 90%|████████▉ | 627/700 [04:31<00:31, 2.31it/s, loss=0.129, lr=0.0001]\nSteps: 90%|████████▉ | 627/700 [04:31<00:31, 2.31it/s, loss=0.0883, lr=0.0001]\nSteps: 90%|████████▉ | 628/700 [04:32<00:31, 2.31it/s, loss=0.0883, lr=0.0001]\nSteps: 90%|████████▉ | 628/700 [04:32<00:31, 2.31it/s, loss=0.0954, lr=0.0001]\nSteps: 90%|████████▉ | 629/700 [04:32<00:30, 2.31it/s, loss=0.0954, lr=0.0001]\nSteps: 90%|████████▉ | 629/700 [04:32<00:30, 2.31it/s, loss=0.174, lr=0.0001] \nSteps: 90%|█████████ | 630/700 [04:33<00:30, 2.31it/s, loss=0.174, lr=0.0001]\nSteps: 90%|█████████ | 630/700 [04:33<00:30, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 90%|█████████ | 631/700 [04:33<00:29, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 90%|█████████ | 631/700 [04:33<00:29, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 90%|█████████ | 632/700 [04:34<00:29, 2.31it/s, loss=0.104, lr=0.0001]\nSteps: 90%|█████████ | 632/700 [04:34<00:29, 2.31it/s, loss=0.12, lr=0.0001] \nSteps: 90%|█████████ | 633/700 [04:34<00:28, 2.31it/s, loss=0.12, lr=0.0001]\nSteps: 90%|█████████ | 633/700 [04:34<00:28, 2.31it/s, loss=0.0687, lr=0.0001]\nSteps: 91%|█████████ | 634/700 [04:34<00:28, 2.31it/s, loss=0.0687, lr=0.0001]\nSteps: 91%|█████████ | 634/700 [04:34<00:28, 2.31it/s, loss=0.153, lr=0.0001] \nSteps: 91%|█████████ | 635/700 [04:35<00:28, 2.31it/s, loss=0.153, 
lr=0.0001]\nSteps: 91%|█████████ | 635/700 [04:35<00:28, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 91%|█████████ | 636/700 [04:35<00:27, 2.31it/s, loss=0.105, lr=0.0001]\nSteps: 91%|█████████ | 636/700 [04:35<00:27, 2.31it/s, loss=0.0692, lr=0.0001]\nSteps: 91%|█████████ | 637/700 [04:36<00:27, 2.30it/s, loss=0.0692, lr=0.0001]\nSteps: 91%|█████████ | 637/700 [04:36<00:27, 2.30it/s, loss=0.101, lr=0.0001] \nSteps: 91%|█████████ | 638/700 [04:36<00:26, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 91%|█████████ | 638/700 [04:36<00:26, 2.31it/s, loss=0.0891, lr=0.0001]\nSteps: 91%|█████████▏| 639/700 [04:37<00:26, 2.31it/s, loss=0.0891, lr=0.0001]\nSteps: 91%|█████████▏| 639/700 [04:37<00:26, 2.31it/s, loss=0.201, lr=0.0001] \nSteps: 91%|█████████▏| 640/700 [04:37<00:25, 2.31it/s, loss=0.201, lr=0.0001]\nSteps: 91%|█████████▏| 640/700 [04:37<00:25, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 92%|█████████▏| 641/700 [04:37<00:25, 2.31it/s, loss=0.136, lr=0.0001]\nSteps: 92%|█████████▏| 641/700 [04:37<00:25, 2.31it/s, loss=0.0821, lr=0.0001]\nSteps: 92%|█████████▏| 642/700 [04:38<00:25, 2.31it/s, loss=0.0821, lr=0.0001]\nSteps: 92%|█████████▏| 642/700 [04:38<00:25, 2.31it/s, loss=0.18, lr=0.0001] \nSteps: 92%|█████████▏| 643/700 [04:38<00:24, 2.31it/s, loss=0.18, lr=0.0001]\nSteps: 92%|█████████▏| 643/700 [04:38<00:24, 2.31it/s, loss=0.0533, lr=0.0001]\nSteps: 92%|█████████▏| 644/700 [04:39<00:24, 2.31it/s, loss=0.0533, lr=0.0001]\nSteps: 92%|█████████▏| 644/700 [04:39<00:24, 2.31it/s, loss=0.0746, lr=0.0001]\nSteps: 92%|█████████▏| 645/700 [04:39<00:23, 2.31it/s, loss=0.0746, lr=0.0001]\nSteps: 92%|█████████▏| 645/700 [04:39<00:23, 2.31it/s, loss=0.0947, lr=0.0001]\nSteps: 92%|█████████▏| 646/700 [04:40<00:23, 2.31it/s, loss=0.0947, lr=0.0001]\nSteps: 92%|█████████▏| 646/700 [04:40<00:23, 2.31it/s, loss=0.0792, lr=0.0001]\nSteps: 92%|█████████▏| 647/700 [04:40<00:22, 2.31it/s, loss=0.0792, lr=0.0001]\nSteps: 92%|█████████▏| 647/700 [04:40<00:22, 2.31it/s, loss=0.0432, 
lr=0.0001]\nSteps: 93%|█████████▎| 648/700 [04:40<00:22, 2.31it/s, loss=0.0432, lr=0.0001]\nSteps: 93%|█████████▎| 648/700 [04:41<00:22, 2.31it/s, loss=0.105, lr=0.0001] \nSteps: 93%|█████████▎| 649/700 [04:41<00:22, 2.30it/s, loss=0.105, lr=0.0001]\nSteps: 93%|█████████▎| 649/700 [04:41<00:22, 2.30it/s, loss=0.135, lr=0.0001]\nSteps: 93%|█████████▎| 650/700 [04:41<00:21, 2.31it/s, loss=0.135, lr=0.0001]\nSteps: 93%|█████████▎| 650/700 [04:41<00:21, 2.31it/s, loss=0.214, lr=0.0001]\nSteps: 93%|█████████▎| 651/700 [04:42<00:21, 2.30it/s, loss=0.214, lr=0.0001]\nSteps: 93%|█████████▎| 651/700 [04:42<00:21, 2.30it/s, loss=0.108, lr=0.0001]\nSteps: 93%|█████████▎| 652/700 [04:42<00:20, 2.31it/s, loss=0.108, lr=0.0001]\nSteps: 93%|█████████▎| 652/700 [04:42<00:20, 2.31it/s, loss=0.0568, lr=0.0001]\nSteps: 93%|█████████▎| 653/700 [04:43<00:20, 2.31it/s, loss=0.0568, lr=0.0001]\nSteps: 93%|█████████▎| 653/700 [04:43<00:20, 2.31it/s, loss=0.131, lr=0.0001] \nSteps: 93%|█████████▎| 654/700 [04:43<00:19, 2.31it/s, loss=0.131, lr=0.0001]\nSteps: 93%|█████████▎| 654/700 [04:43<00:19, 2.31it/s, loss=0.143, lr=0.0001]\nSteps: 94%|█████████▎| 655/700 [04:44<00:19, 2.31it/s, loss=0.143, lr=0.0001]\nSteps: 94%|█████████▎| 655/700 [04:44<00:19, 2.31it/s, loss=0.15, lr=0.0001] \nSteps: 94%|█████████▎| 656/700 [04:44<00:19, 2.31it/s, loss=0.15, lr=0.0001]\nSteps: 94%|█████████▎| 656/700 [04:44<00:19, 2.31it/s, loss=0.0982, lr=0.0001]\nSteps: 94%|█████████▍| 657/700 [04:44<00:18, 2.31it/s, loss=0.0982, lr=0.0001]\nSteps: 94%|█████████▍| 657/700 [04:44<00:18, 2.31it/s, loss=0.0432, lr=0.0001]\nSteps: 94%|█████████▍| 658/700 [04:45<00:18, 2.31it/s, loss=0.0432, lr=0.0001]\nSteps: 94%|█████████▍| 658/700 [04:45<00:18, 2.31it/s, loss=0.116, lr=0.0001] \nSteps: 94%|█████████▍| 659/700 [04:45<00:17, 2.31it/s, loss=0.116, lr=0.0001]\nSteps: 94%|█████████▍| 659/700 [04:45<00:17, 2.31it/s, loss=0.111, lr=0.0001]\nSteps: 94%|█████████▍| 660/700 [04:46<00:17, 2.31it/s, loss=0.111, 
lr=0.0001]\nSteps: 94%|█████████▍| 660/700 [04:46<00:17, 2.31it/s, loss=0.0972, lr=0.0001]\nSteps: 94%|█████████▍| 661/700 [04:46<00:16, 2.30it/s, loss=0.0972, lr=0.0001]\nSteps: 94%|█████████▍| 661/700 [04:46<00:16, 2.30it/s, loss=0.0867, lr=0.0001]\nSteps: 95%|█████████▍| 662/700 [04:47<00:16, 2.31it/s, loss=0.0867, lr=0.0001]\nSteps: 95%|█████████▍| 662/700 [04:47<00:16, 2.31it/s, loss=0.177, lr=0.0001] \nSteps: 95%|█████████▍| 663/700 [04:47<00:16, 2.31it/s, loss=0.177, lr=0.0001]\nSteps: 95%|█████████▍| 663/700 [04:47<00:16, 2.31it/s, loss=0.158, lr=0.0001]\nSteps: 95%|█████████▍| 664/700 [04:47<00:15, 2.31it/s, loss=0.158, lr=0.0001]\nSteps: 95%|█████████▍| 664/700 [04:47<00:15, 2.31it/s, loss=0.185, lr=0.0001]\nSteps: 95%|█████████▌| 665/700 [04:48<00:15, 2.31it/s, loss=0.185, lr=0.0001]\nSteps: 95%|█████████▌| 665/700 [04:48<00:15, 2.31it/s, loss=0.0858, lr=0.0001]\nSteps: 95%|█████████▌| 666/700 [04:48<00:14, 2.31it/s, loss=0.0858, lr=0.0001]\nSteps: 95%|█████████▌| 666/700 [04:48<00:14, 2.31it/s, loss=0.137, lr=0.0001] \nSteps: 95%|█████████▌| 667/700 [04:49<00:14, 2.31it/s, loss=0.137, lr=0.0001]\nSteps: 95%|█████████▌| 667/700 [04:49<00:14, 2.31it/s, loss=0.0444, lr=0.0001]\nSteps: 95%|█████████▌| 668/700 [04:49<00:13, 2.31it/s, loss=0.0444, lr=0.0001]\nSteps: 95%|█████████▌| 668/700 [04:49<00:13, 2.31it/s, loss=0.106, lr=0.0001] \nSteps: 96%|█████████▌| 669/700 [04:50<00:13, 2.31it/s, loss=0.106, lr=0.0001]\nSteps: 96%|█████████▌| 669/700 [04:50<00:13, 2.31it/s, loss=0.0327, lr=0.0001]\nSteps: 96%|█████████▌| 670/700 [04:50<00:12, 2.31it/s, loss=0.0327, lr=0.0001]\nSteps: 96%|█████████▌| 670/700 [04:50<00:12, 2.31it/s, loss=0.0921, lr=0.0001]\nSteps: 96%|█████████▌| 671/700 [04:50<00:12, 2.31it/s, loss=0.0921, lr=0.0001]\nSteps: 96%|█████████▌| 671/700 [04:50<00:12, 2.31it/s, loss=0.122, lr=0.0001] \nSteps: 96%|█████████▌| 672/700 [04:51<00:12, 2.31it/s, loss=0.122, lr=0.0001]\nSteps: 96%|█████████▌| 672/700 [04:51<00:12, 2.31it/s, loss=0.055, 
lr=0.0001]\nSteps: 96%|█████████▌| 673/700 [04:51<00:11, 2.30it/s, loss=0.055, lr=0.0001]\nSteps: 96%|█████████▌| 673/700 [04:51<00:11, 2.30it/s, loss=0.0406, lr=0.0001]\nSteps: 96%|█████████▋| 674/700 [04:52<00:11, 2.31it/s, loss=0.0406, lr=0.0001]\nSteps: 96%|█████████▋| 674/700 [04:52<00:11, 2.31it/s, loss=0.0989, lr=0.0001]\nSteps: 96%|█████████▋| 675/700 [04:52<00:10, 2.31it/s, loss=0.0989, lr=0.0001]\nSteps: 96%|█████████▋| 675/700 [04:52<00:10, 2.31it/s, loss=0.0807, lr=0.0001]\nSteps: 97%|█████████▋| 676/700 [04:53<00:10, 2.31it/s, loss=0.0807, lr=0.0001]\nSteps: 97%|█████████▋| 676/700 [04:53<00:10, 2.31it/s, loss=0.089, lr=0.0001] \nSteps: 97%|█████████▋| 677/700 [04:53<00:09, 2.31it/s, loss=0.089, lr=0.0001]\nSteps: 97%|█████████▋| 677/700 [04:53<00:09, 2.31it/s, loss=0.0711, lr=0.0001]\nSteps: 97%|█████████▋| 678/700 [04:53<00:09, 2.31it/s, loss=0.0711, lr=0.0001]\nSteps: 97%|█████████▋| 678/700 [04:54<00:09, 2.31it/s, loss=0.183, lr=0.0001] \nSteps: 97%|█████████▋| 679/700 [04:54<00:09, 2.31it/s, loss=0.183, lr=0.0001]\nSteps: 97%|█████████▋| 679/700 [04:54<00:09, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 97%|█████████▋| 680/700 [04:54<00:08, 2.31it/s, loss=0.118, lr=0.0001]\nSteps: 97%|█████████▋| 680/700 [04:54<00:08, 2.31it/s, loss=0.0774, lr=0.0001]\nSteps: 97%|█████████▋| 681/700 [04:55<00:08, 2.31it/s, loss=0.0774, lr=0.0001]\nSteps: 97%|█████████▋| 681/700 [04:55<00:08, 2.31it/s, loss=0.109, lr=0.0001] \nSteps: 97%|█████████▋| 682/700 [04:55<00:07, 2.31it/s, loss=0.109, lr=0.0001]\nSteps: 97%|█████████▋| 682/700 [04:55<00:07, 2.31it/s, loss=0.0268, lr=0.0001]\nSteps: 98%|█████████▊| 683/700 [04:56<00:07, 2.31it/s, loss=0.0268, lr=0.0001]\nSteps: 98%|█████████▊| 683/700 [04:56<00:07, 2.31it/s, loss=0.0848, lr=0.0001]\nSteps: 98%|█████████▊| 684/700 [04:56<00:06, 2.31it/s, loss=0.0848, lr=0.0001]\nSteps: 98%|█████████▊| 684/700 [04:56<00:06, 2.31it/s, loss=0.158, lr=0.0001] \nSteps: 98%|█████████▊| 685/700 [04:57<00:06, 2.30it/s, loss=0.158, 
lr=0.0001]\nSteps: 98%|█████████▊| 685/700 [04:57<00:06, 2.30it/s, loss=0.0609, lr=0.0001]\nSteps: 98%|█████████▊| 686/700 [04:57<00:06, 2.30it/s, loss=0.0609, lr=0.0001]\nSteps: 98%|█████████▊| 686/700 [04:57<00:06, 2.30it/s, loss=0.137, lr=0.0001] \nSteps: 98%|█████████▊| 687/700 [04:57<00:05, 2.31it/s, loss=0.137, lr=0.0001]\nSteps: 98%|█████████▊| 687/700 [04:57<00:05, 2.31it/s, loss=0.078, lr=0.0001]\nSteps: 98%|█████████▊| 688/700 [04:58<00:05, 2.30it/s, loss=0.078, lr=0.0001]\nSteps: 98%|█████████▊| 688/700 [04:58<00:05, 2.30it/s, loss=0.0719, lr=0.0001]\nSteps: 98%|█████████▊| 689/700 [04:58<00:04, 2.31it/s, loss=0.0719, lr=0.0001]\nSteps: 98%|█████████▊| 689/700 [04:58<00:04, 2.31it/s, loss=0.06, lr=0.0001] \nSteps: 99%|█████████▊| 690/700 [04:59<00:04, 2.31it/s, loss=0.06, lr=0.0001]\nSteps: 99%|█████████▊| 690/700 [04:59<00:04, 2.31it/s, loss=0.0883, lr=0.0001]\nSteps: 99%|█████████▊| 691/700 [04:59<00:03, 2.31it/s, loss=0.0883, lr=0.0001]\nSteps: 99%|█████████▊| 691/700 [04:59<00:03, 2.31it/s, loss=0.0885, lr=0.0001]\nSteps: 99%|█████████▉| 692/700 [05:00<00:03, 2.31it/s, loss=0.0885, lr=0.0001]\nSteps: 99%|█████████▉| 692/700 [05:00<00:03, 2.31it/s, loss=0.0699, lr=0.0001]\nSteps: 99%|█████████▉| 693/700 [05:00<00:03, 2.31it/s, loss=0.0699, lr=0.0001]\nSteps: 99%|█████████▉| 693/700 [05:00<00:03, 2.31it/s, loss=0.0816, lr=0.0001]\nSteps: 99%|█████████▉| 694/700 [05:00<00:02, 2.31it/s, loss=0.0816, lr=0.0001]\nSteps: 99%|█████████▉| 694/700 [05:00<00:02, 2.31it/s, loss=0.152, lr=0.0001] \nSteps: 99%|█████████▉| 695/700 [05:01<00:02, 2.31it/s, loss=0.152, lr=0.0001]\nSteps: 99%|█████████▉| 695/700 [05:01<00:02, 2.31it/s, loss=0.187, lr=0.0001]\nSteps: 99%|█████████▉| 696/700 [05:01<00:01, 2.31it/s, loss=0.187, lr=0.0001]\nSteps: 99%|█████████▉| 696/700 [05:01<00:01, 2.31it/s, loss=0.066, lr=0.0001]\nSteps: 100%|█████████▉| 697/700 [05:02<00:01, 2.30it/s, loss=0.066, lr=0.0001]\nSteps: 100%|█████████▉| 697/700 [05:02<00:01, 2.30it/s, loss=0.101, 
lr=0.0001]\nSteps: 100%|█████████▉| 698/700 [05:02<00:00, 2.30it/s, loss=0.101, lr=0.0001]\nSteps: 100%|█████████▉| 698/700 [05:02<00:00, 2.30it/s, loss=0.0533, lr=0.0001]\nSteps: 100%|█████████▉| 699/700 [05:03<00:00, 2.30it/s, loss=0.0533, lr=0.0001]\nSteps: 100%|█████████▉| 699/700 [05:03<00:00, 2.30it/s, loss=0.101, lr=0.0001] \nSteps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.101, lr=0.0001]\nSteps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.0975, lr=0.0001]Model weights saved in /tmp/train/output/sd35_large_train_replicate/pytorch_lora_weights.safetensors\nLoading pipeline components...: 0%| | 0/9 [00:00<?, ?it/s]\u001b[ALoaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stable-diffusion-3.5-large.\n{'use_dynamic_shifting', 'max_shift', 'base_shift', 'base_image_seq_len', 'max_image_seq_len'} was not found in config. Values will be initialized to default values.\nLoaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of stable-diffusion-3.5-large.\nLoaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stable-diffusion-3.5-large.\nLoading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]\u001b[A\u001b[A\nLoading checkpoint shards: 50%|█████ | 1/2 [00:05<00:05, 5.05s/it]\u001b[A\u001b[A\nLoading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.79s/it]\u001b[A\u001b[A\nLoading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.83s/it]\nLoaded text_encoder_3 as T5EncoderModel from `text_encoder_3` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 44%|████▍ | 4/9 [00:09<00:12, 2.45s/it]\u001b[ALoaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 56%|█████▌ | 5/9 [00:11<00:08, 2.20s/it]\u001b[ALoaded text_encoder as CLIPTextModelWithProjection from `text_encoder` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 67%|██████▋ | 6/9 
[00:11<00:05, 1.70s/it]\u001b[ALoaded tokenizer_3 as T5TokenizerFast from `tokenizer_3` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 78%|███████▊ | 7/9 [00:11<00:02, 1.28s/it]\u001b[ALoaded vae as AutoencoderKL from `vae` subfolder of stable-diffusion-3.5-large.\n{'dual_attention_layers'} was not found in config. Values will be initialized to default values.\nLoaded transformer as SD3Transformer2DModel from `transformer` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 100%|██████████| 9/9 [00:13<00:00, 1.14s/it]\u001b[A\nLoading pipeline components...: 100%|██████████| 9/9 [00:13<00:00, 1.53s/it]\n 0%| | 0/1 [00:00<?, ?it/s]\u001b[A\n100%|██████████| 1/1 [00:00<00:00, 1.41it/s]\u001b[A\n100%|██████████| 1/1 [00:00<00:00, 1.41it/s]\nSteps: 100%|██████████| 700/700 [05:20<00:00, 2.19it/s, loss=0.0975, lr=0.0001]\n./\n./output/\n./output/sd35_large_train_replicate/\n./output/sd35_large_train_replicate/README.md\n./output/sd35_large_train_replicate/lora.safetensors", "metrics": { "predict_time": 377.992972413, "total_time": 435.359994 }, "output": "https://replicate.delivery/yhqm/knBxQefxU6pEtUucVGpces44ewBLO2UOIUukO3wf9PovdaUdC/trained_model.tar", "started_at": "2024-10-26T05:43:15.862021Z", "status": "succeeded", "urls": { "stream": "https://stream.replicate.com/v1/files/qoxq-gaqff73bfy2r6kj3352e42p2hvpp34wvryz5jyihe4n7j3u4qhta", "get": "https://api.replicate.com/v1/predictions/2vm6qf6cfxrj20cjrxcavwtwb0", "cancel": "https://api.replicate.com/v1/predictions/2vm6qf6cfxrj20cjrxcavwtwb0/cancel" }, "version": "c60e90e9737b8c6b9775e2bc3c167d62da8a7bd7000c9244572b1b5193f3c27a" }
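The progress lines in this log follow tqdm's `step/total [elapsed<remaining, rate, loss=..., lr=...]` layout, and each step count appears twice: first with the previous loss, then with the updated one. As a minimal sketch (not part of the trainer; `parse_losses` and `epochs_for` are hypothetical helper names), the per-step loss can be pulled out of the raw log text by keeping the last value reported for each step. The second helper reproduces the logged `Num Epochs = 59` from 700 steps and 12 batches per epoch, assuming the epoch count is the step count divided by the updates per epoch, rounded up.

```python
import math
import re

# Matches tqdm entries such as
# "661/700 [04:46<00:16, 2.30it/s, loss=0.0867, lr=0.0001]".
# Entries without a loss postfix (e.g. "0/700 [00:00<?, ?it/s]") are skipped.
STEP_RE = re.compile(r"(\d+)/(\d+) \[[^\]]*?loss=([0-9.eE+-]+)")


def parse_losses(log_text):
    """Return {step: loss}, keeping the last loss reported for each step."""
    losses = {}
    for match in STEP_RE.finditer(log_text):
        step, _total, loss = match.groups()
        losses[int(step)] = float(loss)  # later entries overwrite earlier ones
    return losses


def epochs_for(max_train_steps, batches_per_epoch, grad_accum_steps=1):
    """Assumed relation: epochs = ceil(steps / optimizer updates per epoch)."""
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return math.ceil(max_train_steps / updates_per_epoch)


# Four entries in the same layout as the log (bar characters shortened).
sample = (
    "Steps: 94%|#| 661/700 [04:46<00:16, 2.30it/s, loss=0.0972, lr=0.0001]\n"
    "Steps: 94%|#| 661/700 [04:46<00:16, 2.30it/s, loss=0.0867, lr=0.0001]\n"
    "Steps: 95%|#| 662/700 [04:47<00:16, 2.31it/s, loss=0.0867, lr=0.0001]\n"
    "Steps: 95%|#| 662/700 [04:47<00:16, 2.31it/s, loss=0.177, lr=0.0001]\n"
)
losses = parse_losses(sample)
num_epochs = epochs_for(max_train_steps=700, batches_per_epoch=12)
```

With a batch size of 1 the individual losses are noisy, so a moving average over the parsed values is usually easier to read than the raw series.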
Using seed: 3962600180
Extracted 12 files from zip to input_images
The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful
Using params: ['accelerate', 'launch', '--dynamo_backend', 'no', 'train_dreambooth_lora_sd3.py', '--pretrained_model_name_or_path', 'stable-diffusion-3.5-large', '--instance_data_dir', 'input_images', '--rank', '16', '--output_dir', '/tmp/train/output/sd35_large_train_replicate', '--mixed_precision', 'bf16', '--instance_prompt', 'a photo of QSO dog', '--resolution', '768', '--train_batch_size', '1', '--gradient_accumulation_steps', '1', '--optimizer', 'AdamW', '--learning_rate', '0.0001', '--lr_scheduler', 'constant', '--lr_warmup_steps', '0', '--max_train_steps', '700', '--checkpointing_steps', '701', '--seed', '3962600180', '--logging_dir', '/tmp/logs', '--push_to_hub', '--hub_token', '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', '--hub_model_id', 'lucataco/SD3.5-Large-queso']
10/26/2024 05:43:25 - INFO - __main__ - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: bf16
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type t5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'use_dynamic_shifting', 'max_shift', 'base_shift', 'base_image_seq_len', 'max_image_seq_len'} was not found in config. Values will be initialized to default values. Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s] Loading checkpoint shards: 50%|█████ | 1/2 [00:03<00:03, 3.70s/it] Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.55s/it] Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.57s/it] {'dual_attention_layers'} was not found in config. Values will be initialized to default values. 10/26/2024 05:44:12 - INFO - __main__ - ***** Running training ***** 10/26/2024 05:44:12 - INFO - __main__ - Num examples = 12 10/26/2024 05:44:12 - INFO - __main__ - Num batches each epoch = 12 10/26/2024 05:44:12 - INFO - __main__ - Num Epochs = 59 10/26/2024 05:44:12 - INFO - __main__ - Instantaneous batch size per device = 1 10/26/2024 05:44:12 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 1 10/26/2024 05:44:12 - INFO - __main__ - Gradient Accumulation steps = 1 10/26/2024 05:44:12 - INFO - __main__ - Total optimization steps = 700 Steps: 0%| | 0/700 [00:00<?, ?it/s] Steps: 0%| | 1/700 [00:00<07:39, 1.52it/s] Steps: 0%| | 1/700 [00:00<07:39, 1.52it/s, loss=0.139, lr=0.0001] Steps: 0%| | 2/700 [00:01<05:55, 1.96it/s, loss=0.139, lr=0.0001] Steps: 0%| | 2/700 [00:01<05:55, 1.96it/s, loss=0.0678, lr=0.0001] Steps: 0%| | 3/700 [00:01<05:30, 2.11it/s, loss=0.0678, lr=0.0001] Steps: 0%| | 3/700 [00:01<05:30, 2.11it/s, loss=0.04, lr=0.0001] Steps: 1%| | 4/700 [00:01<05:18, 2.19it/s, loss=0.04, lr=0.0001] Steps: 1%| | 4/700 [00:01<05:18, 2.19it/s, loss=0.0636, lr=0.0001] Steps: 1%| | 5/700 [00:02<05:11, 2.23it/s, loss=0.0636, lr=0.0001] Steps: 1%| | 5/700 [00:02<05:11, 2.23it/s, loss=0.0621, lr=0.0001] Steps: 1%| | 6/700 [00:02<05:07, 2.26it/s, loss=0.0621, lr=0.0001] Steps: 1%| | 6/700 [00:02<05:07, 2.26it/s, loss=0.152, lr=0.0001] Steps: 1%| | 7/700 [00:03<05:04, 2.27it/s, loss=0.152, lr=0.0001] Steps: 1%| | 
7/700 [00:03<05:04, 2.27it/s, loss=0.204, lr=0.0001] Steps: 1%| | 8/700 [00:03<05:02, 2.29it/s, loss=0.204, lr=0.0001] Steps: 1%| | 8/700 [00:03<05:02, 2.29it/s, loss=0.105, lr=0.0001] Steps: 1%|▏ | 9/700 [00:04<05:01, 2.29it/s, loss=0.105, lr=0.0001] Steps: 1%|▏ | 9/700 [00:04<05:01, 2.29it/s, loss=0.111, lr=0.0001] Steps: 1%|▏ | 10/700 [00:04<05:00, 2.30it/s, loss=0.111, lr=0.0001] Steps: 1%|▏ | 10/700 [00:04<05:00, 2.30it/s, loss=0.111, lr=0.0001] Steps: 2%|▏ | 11/700 [00:04<04:59, 2.30it/s, loss=0.111, lr=0.0001] Steps: 2%|▏ | 11/700 [00:04<04:59, 2.30it/s, loss=0.161, lr=0.0001] Steps: 2%|▏ | 12/700 [00:05<04:58, 2.31it/s, loss=0.161, lr=0.0001] Steps: 2%|▏ | 12/700 [00:05<04:58, 2.31it/s, loss=0.0618, lr=0.0001] Steps: 2%|▏ | 13/700 [00:05<04:58, 2.30it/s, loss=0.0618, lr=0.0001] Steps: 2%|▏ | 13/700 [00:05<04:58, 2.30it/s, loss=0.0331, lr=0.0001] Steps: 2%|▏ | 14/700 [00:06<04:57, 2.30it/s, loss=0.0331, lr=0.0001] Steps: 2%|▏ | 14/700 [00:06<04:57, 2.30it/s, loss=0.0863, lr=0.0001] Steps: 2%|▏ | 15/700 [00:06<04:56, 2.31it/s, loss=0.0863, lr=0.0001] Steps: 2%|▏ | 15/700 [00:06<04:56, 2.31it/s, loss=0.133, lr=0.0001] Steps: 2%|▏ | 16/700 [00:07<04:56, 2.31it/s, loss=0.133, lr=0.0001] Steps: 2%|▏ | 16/700 [00:07<04:56, 2.31it/s, loss=0.228, lr=0.0001] Steps: 2%|▏ | 17/700 [00:07<04:55, 2.31it/s, loss=0.228, lr=0.0001] Steps: 2%|▏ | 17/700 [00:07<04:55, 2.31it/s, loss=0.102, lr=0.0001] Steps: 3%|▎ | 18/700 [00:07<04:55, 2.31it/s, loss=0.102, lr=0.0001] Steps: 3%|▎ | 18/700 [00:08<04:55, 2.31it/s, loss=0.146, lr=0.0001] Steps: 3%|▎ | 19/700 [00:08<04:55, 2.31it/s, loss=0.146, lr=0.0001] Steps: 3%|▎ | 19/700 [00:08<04:55, 2.31it/s, loss=0.121, lr=0.0001] Steps: 3%|▎ | 20/700 [00:08<04:54, 2.31it/s, loss=0.121, lr=0.0001] Steps: 3%|▎ | 20/700 [00:08<04:54, 2.31it/s, loss=0.129, lr=0.0001] Steps: 3%|▎ | 21/700 [00:09<04:54, 2.31it/s, loss=0.129, lr=0.0001] Steps: 3%|▎ | 21/700 [00:09<04:54, 2.31it/s, loss=0.122, lr=0.0001] Steps: 3%|▎ | 22/700 [00:09<04:53, 
2.31it/s, loss=0.122, lr=0.0001] Steps: 3%|▎ | 22/700 [00:09<04:53, 2.31it/s, loss=0.1, lr=0.0001] Steps: 3%|▎ | 23/700 [00:10<04:53, 2.31it/s, loss=0.1, lr=0.0001] Steps: 3%|▎ | 23/700 [00:10<04:53, 2.31it/s, loss=0.112, lr=0.0001] Steps: 3%|▎ | 24/700 [00:10<04:52, 2.31it/s, loss=0.112, lr=0.0001] Steps: 3%|▎ | 24/700 [00:10<04:52, 2.31it/s, loss=0.105, lr=0.0001] Steps: 4%|▎ | 25/700 [00:11<04:53, 2.30it/s, loss=0.105, lr=0.0001] Steps: 4%|▎ | 25/700 [00:11<04:53, 2.30it/s, loss=0.0693, lr=0.0001] Steps: 4%|▎ | 26/700 [00:11<04:52, 2.30it/s, loss=0.0693, lr=0.0001] Steps: 4%|▎ | 26/700 [00:11<04:52, 2.30it/s, loss=0.152, lr=0.0001] Steps: 4%|▍ | 27/700 [00:11<04:51, 2.31it/s, loss=0.152, lr=0.0001] Steps: 4%|▍ | 27/700 [00:11<04:51, 2.31it/s, loss=0.0932, lr=0.0001] Steps: 4%|▍ | 28/700 [00:12<04:51, 2.31it/s, loss=0.0932, lr=0.0001] Steps: 4%|▍ | 28/700 [00:12<04:51, 2.31it/s, loss=0.193, lr=0.0001] Steps: 4%|▍ | 29/700 [00:12<04:50, 2.31it/s, loss=0.193, lr=0.0001] Steps: 4%|▍ | 29/700 [00:12<04:50, 2.31it/s, loss=0.121, lr=0.0001] Steps: 4%|▍ | 30/700 [00:13<04:49, 2.31it/s, loss=0.121, lr=0.0001] Steps: 4%|▍ | 30/700 [00:13<04:49, 2.31it/s, loss=0.0597, lr=0.0001] Steps: 4%|▍ | 31/700 [00:13<04:49, 2.31it/s, loss=0.0597, lr=0.0001] Steps: 4%|▍ | 31/700 [00:13<04:49, 2.31it/s, loss=0.0413, lr=0.0001] Steps: 5%|▍ | 32/700 [00:14<04:48, 2.31it/s, loss=0.0413, lr=0.0001] Steps: 5%|▍ | 32/700 [00:14<04:48, 2.31it/s, loss=0.0705, lr=0.0001] Steps: 5%|▍ | 33/700 [00:14<04:48, 2.31it/s, loss=0.0705, lr=0.0001] Steps: 5%|▍ | 33/700 [00:14<04:48, 2.31it/s, loss=0.0996, lr=0.0001] Steps: 5%|▍ | 34/700 [00:14<04:48, 2.31it/s, loss=0.0996, lr=0.0001] Steps: 5%|▍ | 34/700 [00:14<04:48, 2.31it/s, loss=0.0392, lr=0.0001] Steps: 5%|▌ | 35/700 [00:15<04:47, 2.31it/s, loss=0.0392, lr=0.0001] Steps: 5%|▌ | 35/700 [00:15<04:47, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 5%|▌ | 36/700 [00:15<04:46, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 5%|▌ | 36/700 [00:15<04:46, 2.31it/s, 
loss=0.127, lr=0.0001] Steps: 5%|▌ | 37/700 [00:16<04:47, 2.30it/s, loss=0.127, lr=0.0001] Steps: 5%|▌ | 37/700 [00:16<04:47, 2.30it/s, loss=0.195, lr=0.0001] Steps: 5%|▌ | 38/700 [00:16<04:46, 2.31it/s, loss=0.195, lr=0.0001] Steps: 5%|▌ | 38/700 [00:16<04:46, 2.31it/s, loss=0.0707, lr=0.0001] Steps: 6%|▌ | 39/700 [00:17<04:46, 2.31it/s, loss=0.0707, lr=0.0001] Steps: 6%|▌ | 39/700 [00:17<04:46, 2.31it/s, loss=0.0302, lr=0.0001] Steps: 6%|▌ | 40/700 [00:17<04:45, 2.31it/s, loss=0.0302, lr=0.0001] Steps: 6%|▌ | 40/700 [00:17<04:45, 2.31it/s, loss=0.0603, lr=0.0001] Steps: 6%|▌ | 41/700 [00:17<04:45, 2.31it/s, loss=0.0603, lr=0.0001] Steps: 6%|▌ | 41/700 [00:17<04:45, 2.31it/s, loss=0.119, lr=0.0001] Steps: 6%|▌ | 42/700 [00:18<04:44, 2.31it/s, loss=0.119, lr=0.0001] Steps: 6%|▌ | 42/700 [00:18<04:44, 2.31it/s, loss=0.101, lr=0.0001] Steps: 6%|▌ | 43/700 [00:18<04:44, 2.31it/s, loss=0.101, lr=0.0001] Steps: 6%|▌ | 43/700 [00:18<04:44, 2.31it/s, loss=0.0303, lr=0.0001] Steps: 6%|▋ | 44/700 [00:19<04:43, 2.31it/s, loss=0.0303, lr=0.0001] Steps: 6%|▋ | 44/700 [00:19<04:43, 2.31it/s, loss=0.152, lr=0.0001] Steps: 6%|▋ | 45/700 [00:19<04:42, 2.32it/s, loss=0.152, lr=0.0001] Steps: 6%|▋ | 45/700 [00:19<04:42, 2.32it/s, loss=0.0641, lr=0.0001] Steps: 7%|▋ | 46/700 [00:20<04:42, 2.31it/s, loss=0.0641, lr=0.0001] Steps: 7%|▋ | 46/700 [00:20<04:42, 2.31it/s, loss=0.0736, lr=0.0001] Steps: 7%|▋ | 47/700 [00:20<04:42, 2.31it/s, loss=0.0736, lr=0.0001] Steps: 7%|▋ | 47/700 [00:20<04:42, 2.31it/s, loss=0.0928, lr=0.0001] Steps: 7%|▋ | 48/700 [00:20<04:41, 2.32it/s, loss=0.0928, lr=0.0001] Steps: 7%|▋ | 48/700 [00:20<04:41, 2.32it/s, loss=0.115, lr=0.0001] Steps: 7%|▋ | 49/700 [00:21<04:42, 2.30it/s, loss=0.115, lr=0.0001] Steps: 7%|▋ | 49/700 [00:21<04:42, 2.30it/s, loss=0.105, lr=0.0001] Steps: 7%|▋ | 50/700 [00:21<04:41, 2.31it/s, loss=0.105, lr=0.0001] Steps: 7%|▋ | 50/700 [00:21<04:41, 2.31it/s, loss=0.0713, lr=0.0001] Steps: 7%|▋ | 51/700 [00:22<04:41, 2.31it/s, loss=0.0713, 
lr=0.0001] Steps: 7%|▋ | 51/700 [00:22<04:41, 2.31it/s, loss=0.0728, lr=0.0001] Steps: 7%|▋ | 52/700 [00:22<04:40, 2.31it/s, loss=0.0728, lr=0.0001] Steps: 7%|▋ | 52/700 [00:22<04:40, 2.31it/s, loss=0.0927, lr=0.0001] Steps: 8%|▊ | 53/700 [00:23<04:39, 2.31it/s, loss=0.0927, lr=0.0001] Steps: 8%|▊ | 53/700 [00:23<04:39, 2.31it/s, loss=0.119, lr=0.0001] Steps: 8%|▊ | 54/700 [00:23<04:39, 2.31it/s, loss=0.119, lr=0.0001] Steps: 8%|▊ | 54/700 [00:23<04:39, 2.31it/s, loss=0.0595, lr=0.0001] Steps: 8%|▊ | 55/700 [00:24<04:38, 2.31it/s, loss=0.0595, lr=0.0001] Steps: 8%|▊ | 55/700 [00:24<04:38, 2.31it/s, loss=0.168, lr=0.0001] Steps: 8%|▊ | 56/700 [00:24<04:38, 2.31it/s, loss=0.168, lr=0.0001] Steps: 8%|▊ | 56/700 [00:24<04:38, 2.31it/s, loss=0.114, lr=0.0001] Steps: 8%|▊ | 57/700 [00:24<04:37, 2.31it/s, loss=0.114, lr=0.0001] Steps: 8%|▊ | 57/700 [00:24<04:37, 2.31it/s, loss=0.191, lr=0.0001] Steps: 8%|▊ | 58/700 [00:25<04:37, 2.31it/s, loss=0.191, lr=0.0001] Steps: 8%|▊ | 58/700 [00:25<04:37, 2.31it/s, loss=0.143, lr=0.0001] Steps: 8%|▊ | 59/700 [00:25<04:37, 2.31it/s, loss=0.143, lr=0.0001] Steps: 8%|▊ | 59/700 [00:25<04:37, 2.31it/s, loss=0.068, lr=0.0001] Steps: 9%|▊ | 60/700 [00:26<04:36, 2.32it/s, loss=0.068, lr=0.0001] Steps: 9%|▊ | 60/700 [00:26<04:36, 2.32it/s, loss=0.0855, lr=0.0001] Steps: 9%|▊ | 61/700 [00:26<04:37, 2.30it/s, loss=0.0855, lr=0.0001] Steps: 9%|▊ | 61/700 [00:26<04:37, 2.30it/s, loss=0.0649, lr=0.0001] Steps: 9%|▉ | 62/700 [00:27<04:36, 2.31it/s, loss=0.0649, lr=0.0001] Steps: 9%|▉ | 62/700 [00:27<04:36, 2.31it/s, loss=0.0905, lr=0.0001] Steps: 9%|▉ | 63/700 [00:27<04:35, 2.31it/s, loss=0.0905, lr=0.0001] Steps: 9%|▉ | 63/700 [00:27<04:35, 2.31it/s, loss=0.0868, lr=0.0001] Steps: 9%|▉ | 64/700 [00:27<04:35, 2.31it/s, loss=0.0868, lr=0.0001] Steps: 9%|▉ | 64/700 [00:27<04:35, 2.31it/s, loss=0.0788, lr=0.0001] Steps: 9%|▉ | 65/700 [00:28<04:34, 2.31it/s, loss=0.0788, lr=0.0001] Steps: 9%|▉ | 65/700 [00:28<04:34, 2.31it/s, loss=0.132, lr=0.0001] 
Steps: 9%|▉ | 66/700 [00:28<04:34, 2.31it/s, loss=0.132, lr=0.0001] Steps: 9%|▉ | 66/700 [00:28<04:34, 2.31it/s, loss=0.122, lr=0.0001] Steps: 10%|▉ | 67/700 [00:29<04:33, 2.31it/s, loss=0.122, lr=0.0001] Steps: 10%|▉ | 67/700 [00:29<04:33, 2.31it/s, loss=0.0693, lr=0.0001] Steps: 10%|▉ | 68/700 [00:29<04:33, 2.31it/s, loss=0.0693, lr=0.0001] Steps: 10%|▉ | 68/700 [00:29<04:33, 2.31it/s, loss=0.111, lr=0.0001] Steps: 10%|▉ | 69/700 [00:30<04:32, 2.31it/s, loss=0.111, lr=0.0001] Steps: 10%|▉ | 69/700 [00:30<04:32, 2.31it/s, loss=0.0441, lr=0.0001] Steps: 10%|█ | 70/700 [00:30<04:32, 2.31it/s, loss=0.0441, lr=0.0001] Steps: 10%|█ | 70/700 [00:30<04:32, 2.31it/s, loss=0.112, lr=0.0001] Steps: 10%|█ | 71/700 [00:30<04:31, 2.31it/s, loss=0.112, lr=0.0001] Steps: 10%|█ | 71/700 [00:30<04:31, 2.31it/s, loss=0.1, lr=0.0001] Steps: 10%|█ | 72/700 [00:31<04:31, 2.32it/s, loss=0.1, lr=0.0001] Steps: 10%|█ | 72/700 [00:31<04:31, 2.32it/s, loss=0.3, lr=0.0001] Steps: 10%|█ | 73/700 [00:31<04:32, 2.30it/s, loss=0.3, lr=0.0001] Steps: 10%|█ | 73/700 [00:31<04:32, 2.30it/s, loss=0.132, lr=0.0001] Steps: 11%|█ | 74/700 [00:32<04:31, 2.31it/s, loss=0.132, lr=0.0001] Steps: 11%|█ | 74/700 [00:32<04:31, 2.31it/s, loss=0.0758, lr=0.0001] Steps: 11%|█ | 75/700 [00:32<04:30, 2.31it/s, loss=0.0758, lr=0.0001] Steps: 11%|█ | 75/700 [00:32<04:30, 2.31it/s, loss=0.107, lr=0.0001] Steps: 11%|█ | 76/700 [00:33<04:30, 2.31it/s, loss=0.107, lr=0.0001] Steps: 11%|█ | 76/700 [00:33<04:30, 2.31it/s, loss=0.0793, lr=0.0001] Steps: 11%|█ | 77/700 [00:33<04:29, 2.31it/s, loss=0.0793, lr=0.0001] Steps: 11%|█ | 77/700 [00:33<04:29, 2.31it/s, loss=0.0566, lr=0.0001] Steps: 11%|█ | 78/700 [00:33<04:28, 2.31it/s, loss=0.0566, lr=0.0001] Steps: 11%|█ | 78/700 [00:33<04:28, 2.31it/s, loss=0.187, lr=0.0001] Steps: 11%|█▏ | 79/700 [00:34<04:28, 2.31it/s, loss=0.187, lr=0.0001] Steps: 11%|█▏ | 79/700 [00:34<04:28, 2.31it/s, loss=0.138, lr=0.0001] Steps: 11%|█▏ | 80/700 [00:34<04:28, 2.31it/s, loss=0.138, 
lr=0.0001] Steps: 11%|█▏ | 80/700 [00:34<04:28, 2.31it/s, loss=0.141, lr=0.0001] Steps: 12%|█▏ | 81/700 [00:35<04:27, 2.31it/s, loss=0.141, lr=0.0001] Steps: 12%|█▏ | 81/700 [00:35<04:27, 2.31it/s, loss=0.0718, lr=0.0001] Steps: 12%|█▏ | 82/700 [00:35<04:26, 2.32it/s, loss=0.0718, lr=0.0001] Steps: 12%|█▏ | 82/700 [00:35<04:26, 2.32it/s, loss=0.134, lr=0.0001] Steps: 12%|█▏ | 83/700 [00:36<04:26, 2.31it/s, loss=0.134, lr=0.0001] Steps: 12%|█▏ | 83/700 [00:36<04:26, 2.31it/s, loss=0.19, lr=0.0001] Steps: 12%|█▏ | 84/700 [00:36<04:26, 2.32it/s, loss=0.19, lr=0.0001] Steps: 12%|█▏ | 84/700 [00:36<04:26, 2.32it/s, loss=0.157, lr=0.0001] Steps: 12%|█▏ | 85/700 [00:36<04:26, 2.30it/s, loss=0.157, lr=0.0001] Steps: 12%|█▏ | 85/700 [00:37<04:26, 2.30it/s, loss=0.0392, lr=0.0001] Steps: 12%|█▏ | 86/700 [00:37<04:26, 2.31it/s, loss=0.0392, lr=0.0001] Steps: 12%|█▏ | 86/700 [00:37<04:26, 2.31it/s, loss=0.223, lr=0.0001] Steps: 12%|█▏ | 87/700 [00:37<04:25, 2.31it/s, loss=0.223, lr=0.0001] Steps: 12%|█▏ | 87/700 [00:37<04:25, 2.31it/s, loss=0.0923, lr=0.0001] Steps: 13%|█▎ | 88/700 [00:38<04:24, 2.31it/s, loss=0.0923, lr=0.0001] Steps: 13%|█▎ | 88/700 [00:38<04:24, 2.31it/s, loss=0.0809, lr=0.0001] Steps: 13%|█▎ | 89/700 [00:38<04:24, 2.31it/s, loss=0.0809, lr=0.0001] Steps: 13%|█▎ | 89/700 [00:38<04:24, 2.31it/s, loss=0.0959, lr=0.0001] Steps: 13%|█▎ | 90/700 [00:39<04:23, 2.31it/s, loss=0.0959, lr=0.0001] Steps: 13%|█▎ | 90/700 [00:39<04:23, 2.31it/s, loss=0.0515, lr=0.0001] Steps: 13%|█▎ | 91/700 [00:39<04:23, 2.31it/s, loss=0.0515, lr=0.0001] Steps: 13%|█▎ | 91/700 [00:39<04:23, 2.31it/s, loss=0.0861, lr=0.0001] Steps: 13%|█▎ | 92/700 [00:40<04:22, 2.31it/s, loss=0.0861, lr=0.0001] Steps: 13%|█▎ | 92/700 [00:40<04:22, 2.31it/s, loss=0.0618, lr=0.0001] Steps: 13%|█▎ | 93/700 [00:40<04:22, 2.31it/s, loss=0.0618, lr=0.0001] Steps: 13%|█▎ | 93/700 [00:40<04:22, 2.31it/s, loss=0.0733, lr=0.0001] Steps: 13%|█▎ | 94/700 [00:40<04:21, 2.31it/s, loss=0.0733, lr=0.0001] Steps: 
13%|█▎ | 94/700 [00:40<04:21, 2.31it/s, loss=0.164, lr=0.0001] Steps: 14%|█▎ | 95/700 [00:41<04:21, 2.32it/s, loss=0.164, lr=0.0001] Steps: 14%|█▎ | 95/700 [00:41<04:21, 2.32it/s, loss=0.123, lr=0.0001] Steps: 14%|█▎ | 96/700 [00:41<04:20, 2.32it/s, loss=0.123, lr=0.0001] Steps: 14%|█▎ | 96/700 [00:41<04:20, 2.32it/s, loss=0.185, lr=0.0001] Steps: 14%|█▍ | 97/700 [00:42<04:21, 2.30it/s, loss=0.185, lr=0.0001] Steps: 14%|█▍ | 97/700 [00:42<04:21, 2.30it/s, loss=0.0795, lr=0.0001] Steps: 14%|█▍ | 98/700 [00:42<04:20, 2.31it/s, loss=0.0795, lr=0.0001] Steps: 14%|█▍ | 98/700 [00:42<04:20, 2.31it/s, loss=0.124, lr=0.0001] Steps: 14%|█▍ | 99/700 [00:43<04:20, 2.31it/s, loss=0.124, lr=0.0001] Steps: 14%|█▍ | 99/700 [00:43<04:20, 2.31it/s, loss=0.157, lr=0.0001] Steps: 14%|█▍ | 100/700 [00:43<04:19, 2.31it/s, loss=0.157, lr=0.0001] Steps: 14%|█▍ | 100/700 [00:43<04:19, 2.31it/s, loss=0.0614, lr=0.0001] Steps: 14%|█▍ | 101/700 [00:43<04:19, 2.31it/s, loss=0.0614, lr=0.0001] Steps: 14%|█▍ | 101/700 [00:43<04:19, 2.31it/s, loss=0.0955, lr=0.0001] Steps: 15%|█▍ | 102/700 [00:44<04:18, 2.31it/s, loss=0.0955, lr=0.0001] Steps: 15%|█▍ | 102/700 [00:44<04:18, 2.31it/s, loss=0.0545, lr=0.0001] Steps: 15%|█▍ | 103/700 [00:44<04:18, 2.31it/s, loss=0.0545, lr=0.0001] Steps: 15%|█▍ | 103/700 [00:44<04:18, 2.31it/s, loss=0.168, lr=0.0001] Steps: 15%|█▍ | 104/700 [00:45<04:17, 2.32it/s, loss=0.168, lr=0.0001] Steps: 15%|█▍ | 104/700 [00:45<04:17, 2.32it/s, loss=0.0944, lr=0.0001] Steps: 15%|█▌ | 105/700 [00:45<04:16, 2.32it/s, loss=0.0944, lr=0.0001] Steps: 15%|█▌ | 105/700 [00:45<04:16, 2.32it/s, loss=0.0917, lr=0.0001] Steps: 15%|█▌ | 106/700 [00:46<04:16, 2.31it/s, loss=0.0917, lr=0.0001] Steps: 15%|█▌ | 106/700 [00:46<04:16, 2.31it/s, loss=0.0696, lr=0.0001] Steps: 15%|█▌ | 107/700 [00:46<04:16, 2.31it/s, loss=0.0696, lr=0.0001] Steps: 15%|█▌ | 107/700 [00:46<04:16, 2.31it/s, loss=0.15, lr=0.0001] Steps: 15%|█▌ | 108/700 [00:46<04:15, 2.32it/s, loss=0.15, lr=0.0001] Steps: 15%|█▌ | 
108/700 [00:46<04:15, 2.32it/s, loss=0.0707, lr=0.0001] Steps: 16%|█▌ | 109/700 [00:47<04:16, 2.30it/s, loss=0.0707, lr=0.0001] Steps: 16%|█▌ | 109/700 [00:47<04:16, 2.30it/s, loss=0.281, lr=0.0001] Steps: 16%|█▌ | 110/700 [00:47<04:15, 2.31it/s, loss=0.281, lr=0.0001] Steps: 16%|█▌ | 110/700 [00:47<04:15, 2.31it/s, loss=0.0787, lr=0.0001] Steps: 16%|█▌ | 111/700 [00:48<04:14, 2.31it/s, loss=0.0787, lr=0.0001] Steps: 16%|█▌ | 111/700 [00:48<04:14, 2.31it/s, loss=0.139, lr=0.0001] Steps: 16%|█▌ | 112/700 [00:48<04:14, 2.31it/s, loss=0.139, lr=0.0001] Steps: 16%|█▌ | 112/700 [00:48<04:14, 2.31it/s, loss=0.15, lr=0.0001] Steps: 16%|█▌ | 113/700 [00:49<04:13, 2.31it/s, loss=0.15, lr=0.0001] Steps: 16%|█▌ | 113/700 [00:49<04:13, 2.31it/s, loss=0.0713, lr=0.0001] Steps: 16%|█▋ | 114/700 [00:49<04:13, 2.31it/s, loss=0.0713, lr=0.0001] Steps: 16%|█▋ | 114/700 [00:49<04:13, 2.31it/s, loss=0.0331, lr=0.0001] Steps: 16%|█▋ | 115/700 [00:49<04:12, 2.31it/s, loss=0.0331, lr=0.0001] Steps: 16%|█▋ | 115/700 [00:49<04:12, 2.31it/s, loss=0.0542, lr=0.0001] Steps: 17%|█▋ | 116/700 [00:50<04:12, 2.31it/s, loss=0.0542, lr=0.0001] Steps: 17%|█▋ | 116/700 [00:50<04:12, 2.31it/s, loss=0.082, lr=0.0001] Steps: 17%|█▋ | 117/700 [00:50<04:12, 2.31it/s, loss=0.082, lr=0.0001] Steps: 17%|█▋ | 117/700 [00:50<04:12, 2.31it/s, loss=0.215, lr=0.0001] Steps: 17%|█▋ | 118/700 [00:51<04:11, 2.31it/s, loss=0.215, lr=0.0001] Steps: 17%|█▋ | 118/700 [00:51<04:11, 2.31it/s, loss=0.0356, lr=0.0001] Steps: 17%|█▋ | 119/700 [00:51<04:11, 2.31it/s, loss=0.0356, lr=0.0001] Steps: 17%|█▋ | 119/700 [00:51<04:11, 2.31it/s, loss=0.156, lr=0.0001] Steps: 17%|█▋ | 120/700 [00:52<04:10, 2.31it/s, loss=0.156, lr=0.0001] Steps: 17%|█▋ | 120/700 [00:52<04:10, 2.31it/s, loss=0.379, lr=0.0001] Steps: 17%|█▋ | 121/700 [00:52<04:11, 2.30it/s, loss=0.379, lr=0.0001] Steps: 17%|█▋ | 121/700 [00:52<04:11, 2.30it/s, loss=0.123, lr=0.0001] Steps: 17%|█▋ | 122/700 [00:52<04:10, 2.31it/s, loss=0.123, lr=0.0001] Steps: 17%|█▋ | 
122/700 [00:53<04:10, 2.31it/s, loss=0.113, lr=0.0001] Steps: 18%|█▊ | 123/700 [00:53<04:09, 2.31it/s, loss=0.113, lr=0.0001] Steps: 18%|█▊ | 123/700 [00:53<04:09, 2.31it/s, loss=0.111, lr=0.0001] Steps: 18%|█▊ | 124/700 [00:53<04:09, 2.31it/s, loss=0.111, lr=0.0001] Steps: 18%|█▊ | 124/700 [00:53<04:09, 2.31it/s, loss=0.042, lr=0.0001] Steps: 18%|█▊ | 125/700 [00:54<04:08, 2.31it/s, loss=0.042, lr=0.0001] Steps: 18%|█▊ | 125/700 [00:54<04:08, 2.31it/s, loss=0.134, lr=0.0001] Steps: 18%|█▊ | 126/700 [00:54<04:08, 2.31it/s, loss=0.134, lr=0.0001] Steps: 18%|█▊ | 126/700 [00:54<04:08, 2.31it/s, loss=0.136, lr=0.0001] Steps: 18%|█▊ | 127/700 [00:55<04:07, 2.31it/s, loss=0.136, lr=0.0001] Steps: 18%|█▊ | 127/700 [00:55<04:07, 2.31it/s, loss=0.0841, lr=0.0001] Steps: 18%|█▊ | 128/700 [00:55<04:07, 2.31it/s, loss=0.0841, lr=0.0001] Steps: 18%|█▊ | 128/700 [00:55<04:07, 2.31it/s, loss=0.0609, lr=0.0001] Steps: 18%|█▊ | 129/700 [00:56<04:06, 2.32it/s, loss=0.0609, lr=0.0001] Steps: 18%|█▊ | 129/700 [00:56<04:06, 2.32it/s, loss=0.154, lr=0.0001] Steps: 19%|█▊ | 130/700 [00:56<04:06, 2.31it/s, loss=0.154, lr=0.0001] Steps: 19%|█▊ | 130/700 [00:56<04:06, 2.31it/s, loss=0.0725, lr=0.0001] Steps: 19%|█▊ | 131/700 [00:56<04:05, 2.31it/s, loss=0.0725, lr=0.0001] Steps: 19%|█▊ | 131/700 [00:56<04:05, 2.31it/s, loss=0.112, lr=0.0001] Steps: 19%|█▉ | 132/700 [00:57<04:05, 2.32it/s, loss=0.112, lr=0.0001] Steps: 19%|█▉ | 132/700 [00:57<04:05, 2.32it/s, loss=0.0866, lr=0.0001] Steps: 19%|█▉ | 133/700 [00:57<04:06, 2.30it/s, loss=0.0866, lr=0.0001] Steps: 19%|█▉ | 133/700 [00:57<04:06, 2.30it/s, loss=0.0815, lr=0.0001] Steps: 19%|█▉ | 134/700 [00:58<04:05, 2.31it/s, loss=0.0815, lr=0.0001] Steps: 19%|█▉ | 134/700 [00:58<04:05, 2.31it/s, loss=0.0781, lr=0.0001] Steps: 19%|█▉ | 135/700 [00:58<04:04, 2.31it/s, loss=0.0781, lr=0.0001] Steps: 19%|█▉ | 135/700 [00:58<04:04, 2.31it/s, loss=0.0736, lr=0.0001] Steps: 19%|█▉ | 136/700 [00:59<04:04, 2.31it/s, loss=0.0736, lr=0.0001] Steps: 19%|█▉ 
| 136/700 [00:59<04:04, 2.31it/s, loss=0.0696, lr=0.0001] Steps: 20%|█▉ | 137/700 [00:59<04:03, 2.31it/s, loss=0.0696, lr=0.0001] Steps: 20%|█▉ | 137/700 [00:59<04:03, 2.31it/s, loss=0.0871, lr=0.0001] Steps: 20%|█▉ | 138/700 [00:59<04:02, 2.31it/s, loss=0.0871, lr=0.0001] Steps: 20%|█▉ | 138/700 [00:59<04:02, 2.31it/s, loss=0.0361, lr=0.0001] Steps: 20%|█▉ | 139/700 [01:00<04:02, 2.31it/s, loss=0.0361, lr=0.0001] Steps: 20%|█▉ | 139/700 [01:00<04:02, 2.31it/s, loss=0.0547, lr=0.0001] Steps: 20%|██ | 140/700 [01:00<04:02, 2.31it/s, loss=0.0547, lr=0.0001] Steps: 20%|██ | 140/700 [01:00<04:02, 2.31it/s, loss=0.0273, lr=0.0001] Steps: 20%|██ | 141/700 [01:01<04:01, 2.32it/s, loss=0.0273, lr=0.0001] Steps: 20%|██ | 141/700 [01:01<04:01, 2.32it/s, loss=0.0602, lr=0.0001] Steps: 20%|██ | 142/700 [01:01<04:00, 2.32it/s, loss=0.0602, lr=0.0001] Steps: 20%|██ | 142/700 [01:01<04:00, 2.32it/s, loss=0.159, lr=0.0001] Steps: 20%|██ | 143/700 [01:02<04:00, 2.32it/s, loss=0.159, lr=0.0001] Steps: 20%|██ | 143/700 [01:02<04:00, 2.32it/s, loss=0.0487, lr=0.0001] Steps: 21%|██ | 144/700 [01:02<04:00, 2.32it/s, loss=0.0487, lr=0.0001] Steps: 21%|██ | 144/700 [01:02<04:00, 2.32it/s, loss=0.0591, lr=0.0001] Steps: 21%|██ | 145/700 [01:02<04:00, 2.30it/s, loss=0.0591, lr=0.0001] Steps: 21%|██ | 145/700 [01:02<04:00, 2.30it/s, loss=0.0889, lr=0.0001] Steps: 21%|██ | 146/700 [01:03<04:00, 2.30it/s, loss=0.0889, lr=0.0001] Steps: 21%|██ | 146/700 [01:03<04:00, 2.30it/s, loss=0.109, lr=0.0001] Steps: 21%|██ | 147/700 [01:03<03:59, 2.31it/s, loss=0.109, lr=0.0001] Steps: 21%|██ | 147/700 [01:03<03:59, 2.31it/s, loss=0.0888, lr=0.0001] Steps: 21%|██ | 148/700 [01:04<03:58, 2.31it/s, loss=0.0888, lr=0.0001] Steps: 21%|██ | 148/700 [01:04<03:58, 2.31it/s, loss=0.163, lr=0.0001] Steps: 21%|██▏ | 149/700 [01:04<03:58, 2.31it/s, loss=0.163, lr=0.0001] Steps: 21%|██▏ | 149/700 [01:04<03:58, 2.31it/s, loss=0.132, lr=0.0001] Steps: 21%|██▏ | 150/700 [01:05<03:57, 2.31it/s, loss=0.132, lr=0.0001] 
Steps: 21%|██▏ | 150/700 [01:05<03:57, 2.31it/s, loss=0.163, lr=0.0001] Steps: 22%|██▏ | 151/700 [01:05<03:57, 2.31it/s, loss=0.163, lr=0.0001] Steps: 22%|██▏ | 151/700 [01:05<03:57, 2.31it/s, loss=0.0821, lr=0.0001] Steps: 22%|██▏ | 152/700 [01:05<03:57, 2.31it/s, loss=0.0821, lr=0.0001] Steps: 22%|██▏ | 152/700 [01:05<03:57, 2.31it/s, loss=0.136, lr=0.0001] Steps: 22%|██▏ | 153/700 [01:06<03:56, 2.31it/s, loss=0.136, lr=0.0001] Steps: 22%|██▏ | 153/700 [01:06<03:56, 2.31it/s, loss=0.0459, lr=0.0001] Steps: 22%|██▏ | 154/700 [01:06<03:56, 2.31it/s, loss=0.0459, lr=0.0001] Steps: 22%|██▏ | 154/700 [01:06<03:56, 2.31it/s, loss=0.106, lr=0.0001] Steps: 22%|██▏ | 155/700 [01:07<03:55, 2.31it/s, loss=0.106, lr=0.0001] Steps: 22%|██▏ | 155/700 [01:07<03:55, 2.31it/s, loss=0.0971, lr=0.0001] Steps: 22%|██▏ | 156/700 [01:07<03:55, 2.31it/s, loss=0.0971, lr=0.0001] Steps: 22%|██▏ | 156/700 [01:07<03:55, 2.31it/s, loss=0.0542, lr=0.0001] Steps: 22%|██▏ | 157/700 [01:08<03:56, 2.30it/s, loss=0.0542, lr=0.0001] Steps: 22%|██▏ | 157/700 [01:08<03:56, 2.30it/s, loss=0.078, lr=0.0001] Steps: 23%|██▎ | 158/700 [01:08<03:55, 2.30it/s, loss=0.078, lr=0.0001] Steps: 23%|██▎ | 158/700 [01:08<03:55, 2.30it/s, loss=0.106, lr=0.0001] Steps: 23%|██▎ | 159/700 [01:08<03:54, 2.31it/s, loss=0.106, lr=0.0001] Steps: 23%|██▎ | 159/700 [01:09<03:54, 2.31it/s, loss=0.0751, lr=0.0001] Steps: 23%|██▎ | 160/700 [01:09<03:53, 2.31it/s, loss=0.0751, lr=0.0001] Steps: 23%|██▎ | 160/700 [01:09<03:53, 2.31it/s, loss=0.178, lr=0.0001] Steps: 23%|██▎ | 161/700 [01:09<03:53, 2.31it/s, loss=0.178, lr=0.0001] Steps: 23%|██▎ | 161/700 [01:09<03:53, 2.31it/s, loss=0.0641, lr=0.0001] Steps: 23%|██▎ | 162/700 [01:10<03:52, 2.31it/s, loss=0.0641, lr=0.0001] Steps: 23%|██▎ | 162/700 [01:10<03:52, 2.31it/s, loss=0.187, lr=0.0001] Steps: 23%|██▎ | 163/700 [01:10<03:52, 2.31it/s, loss=0.187, lr=0.0001] Steps: 23%|██▎ | 163/700 [01:10<03:52, 2.31it/s, loss=0.237, lr=0.0001] Steps: 23%|██▎ | 164/700 [01:11<03:51, 
2.31it/s, loss=0.237, lr=0.0001] Steps: 23%|██▎ | 164/700 [01:11<03:51, 2.31it/s, loss=0.0783, lr=0.0001] Steps: 24%|██▎ | 165/700 [01:11<03:51, 2.31it/s, loss=0.0783, lr=0.0001] Steps: 24%|██▎ | 165/700 [01:11<03:51, 2.31it/s, loss=0.0929, lr=0.0001] Steps: 24%|██▎ | 166/700 [01:12<03:50, 2.31it/s, loss=0.0929, lr=0.0001] Steps: 24%|██▎ | 166/700 [01:12<03:50, 2.31it/s, loss=0.168, lr=0.0001] Steps: 24%|██▍ | 167/700 [01:12<03:50, 2.31it/s, loss=0.168, lr=0.0001] Steps: 24%|██▍ | 167/700 [01:12<03:50, 2.31it/s, loss=0.0386, lr=0.0001] Steps: 24%|██▍ | 168/700 [01:12<03:49, 2.31it/s, loss=0.0386, lr=0.0001] Steps: 24%|██▍ | 168/700 [01:12<03:49, 2.31it/s, loss=0.047, lr=0.0001] Steps: 24%|██▍ | 169/700 [01:13<03:50, 2.30it/s, loss=0.047, lr=0.0001] Steps: 24%|██▍ | 169/700 [01:13<03:50, 2.30it/s, loss=0.0313, lr=0.0001] Steps: 24%|██▍ | 170/700 [01:13<03:50, 2.30it/s, loss=0.0313, lr=0.0001] Steps: 24%|██▍ | 170/700 [01:13<03:50, 2.30it/s, loss=0.128, lr=0.0001] Steps: 24%|██▍ | 171/700 [01:14<03:49, 2.31it/s, loss=0.128, lr=0.0001] Steps: 24%|██▍ | 171/700 [01:14<03:49, 2.31it/s, loss=0.145, lr=0.0001] Steps: 25%|██▍ | 172/700 [01:14<03:48, 2.31it/s, loss=0.145, lr=0.0001] Steps: 25%|██▍ | 172/700 [01:14<03:48, 2.31it/s, loss=0.0553, lr=0.0001] Steps: 25%|██▍ | 173/700 [01:15<03:48, 2.31it/s, loss=0.0553, lr=0.0001] Steps: 25%|██▍ | 173/700 [01:15<03:48, 2.31it/s, loss=0.137, lr=0.0001] Steps: 25%|██▍ | 174/700 [01:15<03:47, 2.31it/s, loss=0.137, lr=0.0001] Steps: 25%|██▍ | 174/700 [01:15<03:47, 2.31it/s, loss=0.0654, lr=0.0001] Steps: 25%|██▌ | 175/700 [01:15<03:47, 2.31it/s, loss=0.0654, lr=0.0001] Steps: 25%|██▌ | 175/700 [01:15<03:47, 2.31it/s, loss=0.128, lr=0.0001] Steps: 25%|██▌ | 176/700 [01:16<03:46, 2.31it/s, loss=0.128, lr=0.0001] Steps: 25%|██▌ | 176/700 [01:16<03:46, 2.31it/s, loss=0.31, lr=0.0001] Steps: 25%|██▌ | 177/700 [01:16<03:46, 2.31it/s, loss=0.31, lr=0.0001] Steps: 25%|██▌ | 177/700 [01:16<03:46, 2.31it/s, loss=0.0623, lr=0.0001] Steps: 
25%|██▌ | 178/700 [01:17<03:45, 2.31it/s, loss=0.0623, lr=0.0001] Steps: 25%|██▌ | 178/700 [01:17<03:45, 2.31it/s, loss=0.102, lr=0.0001] Steps: 26%|██▌ | 179/700 [01:17<03:45, 2.31it/s, loss=0.102, lr=0.0001] Steps: 26%|██▌ | 179/700 [01:17<03:45, 2.31it/s, loss=0.101, lr=0.0001] Steps: 26%|██▌ | 180/700 [01:18<03:44, 2.31it/s, loss=0.101, lr=0.0001] Steps: 26%|██▌ | 180/700 [01:18<03:44, 2.31it/s, loss=0.0696, lr=0.0001] Steps: 26%|██▌ | 181/700 [01:18<03:45, 2.30it/s, loss=0.0696, lr=0.0001] Steps: 26%|██▌ | 181/700 [01:18<03:45, 2.30it/s, loss=0.156, lr=0.0001] Steps: 26%|██▌ | 182/700 [01:18<03:44, 2.30it/s, loss=0.156, lr=0.0001] Steps: 26%|██▌ | 182/700 [01:18<03:44, 2.30it/s, loss=0.0437, lr=0.0001] Steps: 26%|██▌ | 183/700 [01:19<03:44, 2.31it/s, loss=0.0437, lr=0.0001] Steps: 26%|██▌ | 183/700 [01:19<03:44, 2.31it/s, loss=0.0516, lr=0.0001] Steps: 26%|██▋ | 184/700 [01:19<03:43, 2.31it/s, loss=0.0516, lr=0.0001] Steps: 26%|██▋ | 184/700 [01:19<03:43, 2.31it/s, loss=0.198, lr=0.0001] Steps: 26%|██▋ | 185/700 [01:20<03:43, 2.31it/s, loss=0.198, lr=0.0001] Steps: 26%|██▋ | 185/700 [01:20<03:43, 2.31it/s, loss=0.0919, lr=0.0001] Steps: 27%|██▋ | 186/700 [01:20<03:42, 2.31it/s, loss=0.0919, lr=0.0001] Steps: 27%|██▋ | 186/700 [01:20<03:42, 2.31it/s, loss=0.0468, lr=0.0001] Steps: 27%|██▋ | 187/700 [01:21<03:42, 2.31it/s, loss=0.0468, lr=0.0001] Steps: 27%|██▋ | 187/700 [01:21<03:42, 2.31it/s, loss=0.103, lr=0.0001] Steps: 27%|██▋ | 188/700 [01:21<03:41, 2.31it/s, loss=0.103, lr=0.0001] Steps: 27%|██▋ | 188/700 [01:21<03:41, 2.31it/s, loss=0.21, lr=0.0001] Steps: 27%|██▋ | 189/700 [01:21<03:41, 2.31it/s, loss=0.21, lr=0.0001] Steps: 27%|██▋ | 189/700 [01:22<03:41, 2.31it/s, loss=0.19, lr=0.0001] Steps: 27%|██▋ | 190/700 [01:22<03:40, 2.31it/s, loss=0.19, lr=0.0001] Steps: 27%|██▋ | 190/700 [01:22<03:40, 2.31it/s, loss=0.0909, lr=0.0001] Steps: 27%|██▋ | 191/700 [01:22<03:40, 2.31it/s, loss=0.0909, lr=0.0001] Steps: 27%|██▋ | 191/700 [01:22<03:40, 2.31it/s, 
loss=0.138, lr=0.0001] Steps: 27%|██▋ | 192/700 [01:23<03:39, 2.31it/s, loss=0.138, lr=0.0001] Steps: 27%|██▋ | 192/700 [01:23<03:39, 2.31it/s, loss=0.0615, lr=0.0001] Steps: 28%|██▊ | 193/700 [01:23<03:40, 2.30it/s, loss=0.0615, lr=0.0001] Steps: 28%|██▊ | 193/700 [01:23<03:40, 2.30it/s, loss=0.0493, lr=0.0001] Steps: 28%|██▊ | 194/700 [01:24<03:39, 2.30it/s, loss=0.0493, lr=0.0001] Steps: 28%|██▊ | 194/700 [01:24<03:39, 2.30it/s, loss=0.0843, lr=0.0001] Steps: 28%|██▊ | 195/700 [01:24<03:38, 2.31it/s, loss=0.0843, lr=0.0001] Steps: 28%|██▊ | 195/700 [01:24<03:38, 2.31it/s, loss=0.126, lr=0.0001] Steps: 28%|██▊ | 196/700 [01:25<03:38, 2.31it/s, loss=0.126, lr=0.0001] Steps: 28%|██▊ | 196/700 [01:25<03:38, 2.31it/s, loss=0.288, lr=0.0001] Steps: 28%|██▊ | 197/700 [01:25<03:37, 2.31it/s, loss=0.288, lr=0.0001] Steps: 28%|██▊ | 197/700 [01:25<03:37, 2.31it/s, loss=0.237, lr=0.0001] Steps: 28%|██▊ | 198/700 [01:25<03:37, 2.31it/s, loss=0.237, lr=0.0001] Steps: 28%|██▊ | 198/700 [01:25<03:37, 2.31it/s, loss=0.121, lr=0.0001] Steps: 28%|██▊ | 199/700 [01:26<03:36, 2.31it/s, loss=0.121, lr=0.0001] Steps: 28%|██▊ | 199/700 [01:26<03:36, 2.31it/s, loss=0.102, lr=0.0001] Steps: 29%|██▊ | 200/700 [01:26<03:36, 2.31it/s, loss=0.102, lr=0.0001] Steps: 29%|██▊ | 200/700 [01:26<03:36, 2.31it/s, loss=0.152, lr=0.0001] Steps: 29%|██▊ | 201/700 [01:27<03:35, 2.31it/s, loss=0.152, lr=0.0001] Steps: 29%|██▊ | 201/700 [01:27<03:35, 2.31it/s, loss=0.182, lr=0.0001] Steps: 29%|██▉ | 202/700 [01:27<03:35, 2.31it/s, loss=0.182, lr=0.0001] Steps: 29%|██▉ | 202/700 [01:27<03:35, 2.31it/s, loss=0.0467, lr=0.0001] Steps: 29%|██▉ | 203/700 [01:28<03:35, 2.31it/s, loss=0.0467, lr=0.0001] Steps: 29%|██▉ | 203/700 [01:28<03:35, 2.31it/s, loss=0.126, lr=0.0001] Steps: 29%|██▉ | 204/700 [01:28<03:34, 2.31it/s, loss=0.126, lr=0.0001] Steps: 29%|██▉ | 204/700 [01:28<03:34, 2.31it/s, loss=0.0631, lr=0.0001] Steps: 29%|██▉ | 205/700 [01:28<03:35, 2.30it/s, loss=0.0631, lr=0.0001] Steps: 29%|██▉ | 
205/700 [01:28<03:35, 2.30it/s, loss=0.0418, lr=0.0001] Steps: 29%|██▉ | 206/700 [01:29<03:34, 2.30it/s, loss=0.0418, lr=0.0001] Steps: 29%|██▉ | 206/700 [01:29<03:34, 2.30it/s, loss=0.133, lr=0.0001] Steps: 30%|██▉ | 207/700 [01:29<03:34, 2.30it/s, loss=0.133, lr=0.0001] Steps: 30%|██▉ | 207/700 [01:29<03:34, 2.30it/s, loss=0.0892, lr=0.0001] Steps: 30%|██▉ | 208/700 [01:30<03:33, 2.31it/s, loss=0.0892, lr=0.0001] Steps: 30%|██▉ | 208/700 [01:30<03:33, 2.31it/s, loss=0.103, lr=0.0001] Steps: 30%|██▉ | 209/700 [01:30<03:32, 2.31it/s, loss=0.103, lr=0.0001] Steps: 30%|██▉ | 209/700 [01:30<03:32, 2.31it/s, loss=0.178, lr=0.0001] Steps: 30%|███ | 210/700 [01:31<03:32, 2.31it/s, loss=0.178, lr=0.0001] Steps: 30%|███ | 210/700 [01:31<03:32, 2.31it/s, loss=0.0359, lr=0.0001] Steps: 30%|███ | 211/700 [01:31<03:31, 2.31it/s, loss=0.0359, lr=0.0001] Steps: 30%|███ | 211/700 [01:31<03:31, 2.31it/s, loss=0.0537, lr=0.0001] Steps: 30%|███ | 212/700 [01:31<03:31, 2.31it/s, loss=0.0537, lr=0.0001] Steps: 30%|███ | 212/700 [01:31<03:31, 2.31it/s, loss=0.0484, lr=0.0001] Steps: 30%|███ | 213/700 [01:32<03:31, 2.31it/s, loss=0.0484, lr=0.0001] Steps: 30%|███ | 213/700 [01:32<03:31, 2.31it/s, loss=0.02, lr=0.0001] Steps: 31%|███ | 214/700 [01:32<03:30, 2.31it/s, loss=0.02, lr=0.0001] Steps: 31%|███ | 214/700 [01:32<03:30, 2.31it/s, loss=0.0563, lr=0.0001] Steps: 31%|███ | 215/700 [01:33<03:29, 2.31it/s, loss=0.0563, lr=0.0001] Steps: 31%|███ | 215/700 [01:33<03:29, 2.31it/s, loss=0.0508, lr=0.0001] Steps: 31%|███ | 216/700 [01:33<03:29, 2.31it/s, loss=0.0508, lr=0.0001] Steps: 31%|███ | 216/700 [01:33<03:29, 2.31it/s, loss=0.0738, lr=0.0001] Steps: 31%|███ | 217/700 [01:34<03:30, 2.30it/s, loss=0.0738, lr=0.0001] Steps: 31%|███ | 217/700 [01:34<03:30, 2.30it/s, loss=0.0832, lr=0.0001] Steps: 31%|███ | 218/700 [01:34<03:29, 2.30it/s, loss=0.0832, lr=0.0001] Steps: 31%|███ | 218/700 [01:34<03:29, 2.30it/s, loss=0.151, lr=0.0001] Steps: 31%|███▏ | 219/700 [01:34<03:28, 2.31it/s, 
loss=0.151, lr=0.0001] Steps: 31%|███▏ | 219/700 [01:35<03:28, 2.31it/s, loss=0.113, lr=0.0001] Steps: 31%|███▏ | 220/700 [01:35<03:27, 2.31it/s, loss=0.113, lr=0.0001] Steps: 31%|███▏ | 220/700 [01:35<03:27, 2.31it/s, loss=0.074, lr=0.0001] Steps: 32%|███▏ | 221/700 [01:35<03:27, 2.31it/s, loss=0.074, lr=0.0001] Steps: 32%|███▏ | 221/700 [01:35<03:27, 2.31it/s, loss=0.15, lr=0.0001] Steps: 32%|███▏ | 222/700 [01:36<03:26, 2.31it/s, loss=0.15, lr=0.0001] Steps: 32%|███▏ | 222/700 [01:36<03:26, 2.31it/s, loss=0.0893, lr=0.0001] Steps: 32%|███▏ | 223/700 [01:36<03:26, 2.31it/s, loss=0.0893, lr=0.0001] Steps: 32%|███▏ | 223/700 [01:36<03:26, 2.31it/s, loss=0.118, lr=0.0001] Steps: 32%|███▏ | 224/700 [01:37<03:25, 2.31it/s, loss=0.118, lr=0.0001] Steps: 32%|███▏ | 224/700 [01:37<03:25, 2.31it/s, loss=0.156, lr=0.0001] Steps: 32%|███▏ | 225/700 [01:37<03:25, 2.31it/s, loss=0.156, lr=0.0001] Steps: 32%|███▏ | 225/700 [01:37<03:25, 2.31it/s, loss=0.0856, lr=0.0001] Steps: 32%|███▏ | 226/700 [01:38<03:25, 2.31it/s, loss=0.0856, lr=0.0001] Steps: 32%|███▏ | 226/700 [01:38<03:25, 2.31it/s, loss=0.142, lr=0.0001] Steps: 32%|███▏ | 227/700 [01:38<03:24, 2.31it/s, loss=0.142, lr=0.0001] Steps: 32%|███▏ | 227/700 [01:38<03:24, 2.31it/s, loss=0.135, lr=0.0001] Steps: 33%|███▎ | 228/700 [01:38<03:24, 2.31it/s, loss=0.135, lr=0.0001] Steps: 33%|███▎ | 228/700 [01:38<03:24, 2.31it/s, loss=0.0868, lr=0.0001] Steps: 33%|███▎ | 229/700 [01:39<03:24, 2.30it/s, loss=0.0868, lr=0.0001] Steps: 33%|███▎ | 229/700 [01:39<03:24, 2.30it/s, loss=0.0699, lr=0.0001] Steps: 33%|███▎ | 230/700 [01:39<03:23, 2.31it/s, loss=0.0699, lr=0.0001] Steps: 33%|███▎ | 230/700 [01:39<03:23, 2.31it/s, loss=0.111, lr=0.0001] Steps: 33%|███▎ | 231/700 [01:40<03:23, 2.31it/s, loss=0.111, lr=0.0001] Steps: 33%|███▎ | 231/700 [01:40<03:23, 2.31it/s, loss=0.0788, lr=0.0001] Steps: 33%|███▎ | 232/700 [01:40<03:22, 2.31it/s, loss=0.0788, lr=0.0001] Steps: 33%|███▎ | 232/700 [01:40<03:22, 2.31it/s, loss=0.0501, 
lr=0.0001] Steps: 33%|███▎ | 233/700 [01:41<03:22, 2.31it/s, loss=0.0501, lr=0.0001] Steps: 33%|███▎ | 233/700 [01:41<03:22, 2.31it/s, loss=0.0609, lr=0.0001] Steps: 33%|███▎ | 234/700 [01:41<03:21, 2.31it/s, loss=0.0609, lr=0.0001] Steps: 33%|███▎ | 234/700 [01:41<03:21, 2.31it/s, loss=0.0557, lr=0.0001] Steps: 34%|███▎ | 235/700 [01:41<03:21, 2.31it/s, loss=0.0557, lr=0.0001] Steps: 34%|███▎ | 235/700 [01:41<03:21, 2.31it/s, loss=0.0626, lr=0.0001] Steps: 34%|███▎ | 236/700 [01:42<03:21, 2.31it/s, loss=0.0626, lr=0.0001] Steps: 34%|███▎ | 236/700 [01:42<03:21, 2.31it/s, loss=0.23, lr=0.0001] Steps: 34%|███▍ | 237/700 [01:42<03:20, 2.30it/s, loss=0.23, lr=0.0001] Steps: 34%|███▍ | 237/700 [01:42<03:20, 2.30it/s, loss=0.186, lr=0.0001] Steps: 34%|███▍ | 238/700 [01:43<03:20, 2.31it/s, loss=0.186, lr=0.0001] Steps: 34%|███▍ | 238/700 [01:43<03:20, 2.31it/s, loss=0.067, lr=0.0001] Steps: 34%|███▍ | 239/700 [01:43<03:19, 2.31it/s, loss=0.067, lr=0.0001] Steps: 34%|███▍ | 239/700 [01:43<03:19, 2.31it/s, loss=0.113, lr=0.0001] Steps: 34%|███▍ | 240/700 [01:44<03:19, 2.31it/s, loss=0.113, lr=0.0001] Steps: 34%|███▍ | 240/700 [01:44<03:19, 2.31it/s, loss=0.0939, lr=0.0001] Steps: 34%|███▍ | 241/700 [01:44<03:19, 2.30it/s, loss=0.0939, lr=0.0001] Steps: 34%|███▍ | 241/700 [01:44<03:19, 2.30it/s, loss=0.0754, lr=0.0001] Steps: 35%|███▍ | 242/700 [01:44<03:19, 2.30it/s, loss=0.0754, lr=0.0001] Steps: 35%|███▍ | 242/700 [01:44<03:19, 2.30it/s, loss=0.214, lr=0.0001] Steps: 35%|███▍ | 243/700 [01:45<03:18, 2.30it/s, loss=0.214, lr=0.0001] Steps: 35%|███▍ | 243/700 [01:45<03:18, 2.30it/s, loss=0.096, lr=0.0001] Steps: 35%|███▍ | 244/700 [01:45<03:17, 2.31it/s, loss=0.096, lr=0.0001] Steps: 35%|███▍ | 244/700 [01:45<03:17, 2.31it/s, loss=0.0839, lr=0.0001] Steps: 35%|███▌ | 245/700 [01:46<03:17, 2.31it/s, loss=0.0839, lr=0.0001] Steps: 35%|███▌ | 245/700 [01:46<03:17, 2.31it/s, loss=0.133, lr=0.0001] Steps: 35%|███▌ | 246/700 [01:46<03:16, 2.31it/s, loss=0.133, lr=0.0001] Steps: 
35%|███▌ | 246/700 [01:46<03:16, 2.31it/s, loss=0.104, lr=0.0001] Steps: 35%|███▌ | 247/700 [01:47<03:16, 2.31it/s, loss=0.104, lr=0.0001] Steps: 35%|███▌ | 247/700 [01:47<03:16, 2.31it/s, loss=0.0977, lr=0.0001] Steps: 35%|███▌ | 248/700 [01:47<03:15, 2.31it/s, loss=0.0977, lr=0.0001] Steps: 35%|███▌ | 248/700 [01:47<03:15, 2.31it/s, loss=0.164, lr=0.0001] Steps: 36%|███▌ | 249/700 [01:47<03:15, 2.31it/s, loss=0.164, lr=0.0001] Steps: 36%|███▌ | 249/700 [01:48<03:15, 2.31it/s, loss=0.059, lr=0.0001] Steps: 36%|███▌ | 250/700 [01:48<03:14, 2.31it/s, loss=0.059, lr=0.0001] Steps: 36%|███▌ | 250/700 [01:48<03:14, 2.31it/s, loss=0.052, lr=0.0001] Steps: 36%|███▌ | 251/700 [01:48<03:14, 2.31it/s, loss=0.052, lr=0.0001] Steps: 36%|███▌ | 251/700 [01:48<03:14, 2.31it/s, loss=0.115, lr=0.0001] Steps: 36%|███▌ | 252/700 [01:49<03:14, 2.31it/s, loss=0.115, lr=0.0001] Steps: 36%|███▌ | 252/700 [01:49<03:14, 2.31it/s, loss=0.0825, lr=0.0001] Steps: 36%|███▌ | 253/700 [01:49<03:14, 2.30it/s, loss=0.0825, lr=0.0001] Steps: 36%|███▌ | 253/700 [01:49<03:14, 2.30it/s, loss=0.047, lr=0.0001] Steps: 36%|███▋ | 254/700 [01:50<03:13, 2.30it/s, loss=0.047, lr=0.0001] Steps: 36%|███▋ | 254/700 [01:50<03:13, 2.30it/s, loss=0.0716, lr=0.0001] Steps: 36%|███▋ | 255/700 [01:50<03:13, 2.30it/s, loss=0.0716, lr=0.0001] Steps: 36%|███▋ | 255/700 [01:50<03:13, 2.30it/s, loss=0.0739, lr=0.0001] Steps: 37%|███▋ | 256/700 [01:51<03:12, 2.31it/s, loss=0.0739, lr=0.0001] Steps: 37%|███▋ | 256/700 [01:51<03:12, 2.31it/s, loss=0.162, lr=0.0001] Steps: 37%|███▋ | 257/700 [01:51<03:11, 2.31it/s, loss=0.162, lr=0.0001] Steps: 37%|███▋ | 257/700 [01:51<03:11, 2.31it/s, loss=0.101, lr=0.0001] Steps: 37%|███▋ | 258/700 [01:51<03:11, 2.31it/s, loss=0.101, lr=0.0001] Steps: 37%|███▋ | 258/700 [01:51<03:11, 2.31it/s, loss=0.0502, lr=0.0001] Steps: 37%|███▋ | 259/700 [01:52<03:10, 2.31it/s, loss=0.0502, lr=0.0001] Steps: 37%|███▋ | 259/700 [01:52<03:10, 2.31it/s, loss=0.0932, lr=0.0001] Steps: 37%|███▋ | 
260/700 [01:52<03:10, 2.31it/s, loss=0.0932, lr=0.0001] Steps: 37%|███▋ | 260/700 [01:52<03:10, 2.31it/s, loss=0.0996, lr=0.0001] Steps: 37%|███▋ | 261/700 [01:53<03:09, 2.31it/s, loss=0.0996, lr=0.0001] Steps: 37%|███▋ | 261/700 [01:53<03:09, 2.31it/s, loss=0.0506, lr=0.0001] Steps: 37%|███▋ | 262/700 [01:53<03:09, 2.31it/s, loss=0.0506, lr=0.0001] Steps: 37%|███▋ | 262/700 [01:53<03:09, 2.31it/s, loss=0.184, lr=0.0001] Steps: 38%|███▊ | 263/700 [01:54<03:09, 2.31it/s, loss=0.184, lr=0.0001] Steps: 38%|███▊ | 263/700 [01:54<03:09, 2.31it/s, loss=0.0982, lr=0.0001] Steps: 38%|███▊ | 264/700 [01:54<03:08, 2.31it/s, loss=0.0982, lr=0.0001] Steps: 38%|███▊ | 264/700 [01:54<03:08, 2.31it/s, loss=0.0981, lr=0.0001] Steps: 38%|███▊ | 265/700 [01:54<03:08, 2.30it/s, loss=0.0981, lr=0.0001] Steps: 38%|███▊ | 265/700 [01:54<03:08, 2.30it/s, loss=0.0722, lr=0.0001] Steps: 38%|███▊ | 266/700 [01:55<03:08, 2.30it/s, loss=0.0722, lr=0.0001] Steps: 38%|███▊ | 266/700 [01:55<03:08, 2.30it/s, loss=0.085, lr=0.0001] Steps: 38%|███▊ | 267/700 [01:55<03:07, 2.31it/s, loss=0.085, lr=0.0001] Steps: 38%|███▊ | 267/700 [01:55<03:07, 2.31it/s, loss=0.0857, lr=0.0001] Steps: 38%|███▊ | 268/700 [01:56<03:07, 2.31it/s, loss=0.0857, lr=0.0001] Steps: 38%|███▊ | 268/700 [01:56<03:07, 2.31it/s, loss=0.0924, lr=0.0001] Steps: 38%|███▊ | 269/700 [01:56<03:06, 2.31it/s, loss=0.0924, lr=0.0001] Steps: 38%|███▊ | 269/700 [01:56<03:06, 2.31it/s, loss=0.0701, lr=0.0001] Steps: 39%|███▊ | 270/700 [01:57<03:06, 2.31it/s, loss=0.0701, lr=0.0001] Steps: 39%|███▊ | 270/700 [01:57<03:06, 2.31it/s, loss=0.0999, lr=0.0001] Steps: 39%|███▊ | 271/700 [01:57<03:05, 2.31it/s, loss=0.0999, lr=0.0001] Steps: 39%|███▊ | 271/700 [01:57<03:05, 2.31it/s, loss=0.106, lr=0.0001] Steps: 39%|███▉ | 272/700 [01:57<03:05, 2.31it/s, loss=0.106, lr=0.0001] Steps: 39%|███▉ | 272/700 [01:57<03:05, 2.31it/s, loss=0.0785, lr=0.0001] Steps: 39%|███▉ | 273/700 [01:58<03:04, 2.31it/s, loss=0.0785, lr=0.0001] Steps: 39%|███▉ | 273/700 
[01:58<03:04, 2.31it/s, loss=0.121, lr=0.0001] Steps: 39%|███▉ | 274/700 [01:58<03:04, 2.31it/s, loss=0.121, lr=0.0001] Steps: 39%|███▉ | 274/700 [01:58<03:04, 2.31it/s, loss=0.0753, lr=0.0001] Steps: 39%|███▉ | 275/700 [01:59<03:04, 2.31it/s, loss=0.0753, lr=0.0001] Steps: 39%|███▉ | 275/700 [01:59<03:04, 2.31it/s, loss=0.0554, lr=0.0001] Steps: 39%|███▉ | 276/700 [01:59<03:03, 2.31it/s, loss=0.0554, lr=0.0001] Steps: 39%|███▉ | 276/700 [01:59<03:03, 2.31it/s, loss=0.153, lr=0.0001] Steps: 40%|███▉ | 277/700 [02:00<03:04, 2.30it/s, loss=0.153, lr=0.0001] Steps: 40%|███▉ | 277/700 [02:00<03:04, 2.30it/s, loss=0.117, lr=0.0001] Steps: 40%|███▉ | 278/700 [02:00<03:03, 2.30it/s, loss=0.117, lr=0.0001] Steps: 40%|███▉ | 278/700 [02:00<03:03, 2.30it/s, loss=0.174, lr=0.0001] Steps: 40%|███▉ | 279/700 [02:00<03:02, 2.30it/s, loss=0.174, lr=0.0001] Steps: 40%|███▉ | 279/700 [02:01<03:02, 2.30it/s, loss=0.165, lr=0.0001] Steps: 40%|████ | 280/700 [02:01<03:02, 2.31it/s, loss=0.165, lr=0.0001] Steps: 40%|████ | 280/700 [02:01<03:02, 2.31it/s, loss=0.0458, lr=0.0001] Steps: 40%|████ | 281/700 [02:01<03:01, 2.31it/s, loss=0.0458, lr=0.0001] Steps: 40%|████ | 281/700 [02:01<03:01, 2.31it/s, loss=0.123, lr=0.0001] Steps: 40%|████ | 282/700 [02:02<03:01, 2.31it/s, loss=0.123, lr=0.0001] Steps: 40%|████ | 282/700 [02:02<03:01, 2.31it/s, loss=0.0655, lr=0.0001] Steps: 40%|████ | 283/700 [02:02<03:00, 2.31it/s, loss=0.0655, lr=0.0001] Steps: 40%|████ | 283/700 [02:02<03:00, 2.31it/s, loss=0.173, lr=0.0001] Steps: 41%|████ | 284/700 [02:03<03:00, 2.31it/s, loss=0.173, lr=0.0001] Steps: 41%|████ | 284/700 [02:03<03:00, 2.31it/s, loss=0.0757, lr=0.0001] Steps: 41%|████ | 285/700 [02:03<02:59, 2.31it/s, loss=0.0757, lr=0.0001] Steps: 41%|████ | 285/700 [02:03<02:59, 2.31it/s, loss=0.0679, lr=0.0001] Steps: 41%|████ | 286/700 [02:04<02:59, 2.31it/s, loss=0.0679, lr=0.0001] Steps: 41%|████ | 286/700 [02:04<02:59, 2.31it/s, loss=0.0842, lr=0.0001] Steps: 41%|████ | 287/700 [02:04<02:58, 
2.31it/s, loss=0.0842, lr=0.0001] Steps: 41%|████ | 287/700 [02:04<02:58, 2.31it/s, loss=0.0515, lr=0.0001] Steps: 41%|████ | 288/700 [02:04<02:58, 2.31it/s, loss=0.0515, lr=0.0001] Steps: 41%|████ | 288/700 [02:04<02:58, 2.31it/s, loss=0.046, lr=0.0001] Steps: 41%|████▏ | 289/700 [02:05<02:58, 2.30it/s, loss=0.046, lr=0.0001] Steps: 41%|████▏ | 289/700 [02:05<02:58, 2.30it/s, loss=0.0335, lr=0.0001] Steps: 41%|████▏ | 290/700 [02:05<02:57, 2.30it/s, loss=0.0335, lr=0.0001] Steps: 41%|████▏ | 290/700 [02:05<02:57, 2.30it/s, loss=0.249, lr=0.0001] Steps: 42%|████▏ | 291/700 [02:06<02:57, 2.31it/s, loss=0.249, lr=0.0001] Steps: 42%|████▏ | 291/700 [02:06<02:57, 2.31it/s, loss=0.118, lr=0.0001] Steps: 42%|████▏ | 292/700 [02:06<02:56, 2.31it/s, loss=0.118, lr=0.0001] Steps: 42%|████▏ | 292/700 [02:06<02:56, 2.31it/s, loss=0.11, lr=0.0001] Steps: 42%|████▏ | 293/700 [02:07<02:56, 2.31it/s, loss=0.11, lr=0.0001] Steps: 42%|████▏ | 293/700 [02:07<02:56, 2.31it/s, loss=0.166, lr=0.0001] Steps: 42%|████▏ | 294/700 [02:07<02:55, 2.31it/s, loss=0.166, lr=0.0001] Steps: 42%|████▏ | 294/700 [02:07<02:55, 2.31it/s, loss=0.196, lr=0.0001] Steps: 42%|████▏ | 295/700 [02:07<02:55, 2.31it/s, loss=0.196, lr=0.0001] Steps: 42%|████▏ | 295/700 [02:07<02:55, 2.31it/s, loss=0.16, lr=0.0001] Steps: 42%|████▏ | 296/700 [02:08<02:54, 2.31it/s, loss=0.16, lr=0.0001] Steps: 42%|████▏ | 296/700 [02:08<02:54, 2.31it/s, loss=0.125, lr=0.0001] Steps: 42%|████▏ | 297/700 [02:08<02:54, 2.31it/s, loss=0.125, lr=0.0001] Steps: 42%|████▏ | 297/700 [02:08<02:54, 2.31it/s, loss=0.0685, lr=0.0001] Steps: 43%|████▎ | 298/700 [02:09<02:54, 2.31it/s, loss=0.0685, lr=0.0001] Steps: 43%|████▎ | 298/700 [02:09<02:54, 2.31it/s, loss=0.0654, lr=0.0001] Steps: 43%|████▎ | 299/700 [02:09<02:53, 2.31it/s, loss=0.0654, lr=0.0001] Steps: 43%|████▎ | 299/700 [02:09<02:53, 2.31it/s, loss=0.102, lr=0.0001] Steps: 43%|████▎ | 300/700 [02:10<02:53, 2.31it/s, loss=0.102, lr=0.0001] Steps: 43%|████▎ | 300/700 [02:10<02:53, 
2.31it/s, loss=0.307, lr=0.0001] Steps: 43%|████▎ | 301/700 [02:10<02:53, 2.30it/s, loss=0.307, lr=0.0001] Steps: 43%|████▎ | 301/700 [02:10<02:53, 2.30it/s, loss=0.0656, lr=0.0001] Steps: 43%|████▎ | 302/700 [02:10<02:52, 2.30it/s, loss=0.0656, lr=0.0001] Steps: 43%|████▎ | 302/700 [02:10<02:52, 2.30it/s, loss=0.13, lr=0.0001] Steps: 43%|████▎ | 303/700 [02:11<02:52, 2.30it/s, loss=0.13, lr=0.0001] Steps: 43%|████▎ | 303/700 [02:11<02:52, 2.30it/s, loss=0.147, lr=0.0001] Steps: 43%|████▎ | 304/700 [02:11<02:51, 2.31it/s, loss=0.147, lr=0.0001] Steps: 43%|████▎ | 304/700 [02:11<02:51, 2.31it/s, loss=0.171, lr=0.0001] Steps: 44%|████▎ | 305/700 [02:12<02:51, 2.31it/s, loss=0.171, lr=0.0001] Steps: 44%|████▎ | 305/700 [02:12<02:51, 2.31it/s, loss=0.0742, lr=0.0001] Steps: 44%|████▎ | 306/700 [02:12<02:50, 2.31it/s, loss=0.0742, lr=0.0001] Steps: 44%|████▎ | 306/700 [02:12<02:50, 2.31it/s, loss=0.208, lr=0.0001] Steps: 44%|████▍ | 307/700 [02:13<02:50, 2.31it/s, loss=0.208, lr=0.0001] Steps: 44%|████▍ | 307/700 [02:13<02:50, 2.31it/s, loss=0.138, lr=0.0001] Steps: 44%|████▍ | 308/700 [02:13<02:49, 2.31it/s, loss=0.138, lr=0.0001] Steps: 44%|████▍ | 308/700 [02:13<02:49, 2.31it/s, loss=0.0506, lr=0.0001] Steps: 44%|████▍ | 309/700 [02:13<02:49, 2.31it/s, loss=0.0506, lr=0.0001] Steps: 44%|████▍ | 309/700 [02:14<02:49, 2.31it/s, loss=0.0898, lr=0.0001] Steps: 44%|████▍ | 310/700 [02:14<02:48, 2.31it/s, loss=0.0898, lr=0.0001] Steps: 44%|████▍ | 310/700 [02:14<02:48, 2.31it/s, loss=0.157, lr=0.0001] Steps: 44%|████▍ | 311/700 [02:14<02:48, 2.31it/s, loss=0.157, lr=0.0001] Steps: 44%|████▍ | 311/700 [02:14<02:48, 2.31it/s, loss=0.13, lr=0.0001] Steps: 45%|████▍ | 312/700 [02:15<02:47, 2.31it/s, loss=0.13, lr=0.0001] Steps: 45%|████▍ | 312/700 [02:15<02:47, 2.31it/s, loss=0.104, lr=0.0001] Steps: 45%|████▍ | 313/700 [02:15<02:48, 2.30it/s, loss=0.104, lr=0.0001] Steps: 45%|████▍ | 313/700 [02:15<02:48, 2.30it/s, loss=0.0702, lr=0.0001] Steps: 45%|████▍ | 314/700 
[02:16<02:47, 2.30it/s, loss=0.0702, lr=0.0001] Steps: 45%|████▍ | 314/700 [02:16<02:47, 2.30it/s, loss=0.0639, lr=0.0001] Steps: 45%|████▌ | 315/700 [02:16<02:47, 2.31it/s, loss=0.0639, lr=0.0001] Steps: 45%|████▌ | 315/700 [02:16<02:47, 2.31it/s, loss=0.0803, lr=0.0001] Steps: 45%|████▌ | 316/700 [02:17<02:46, 2.31it/s, loss=0.0803, lr=0.0001] Steps: 45%|████▌ | 316/700 [02:17<02:46, 2.31it/s, loss=0.0989, lr=0.0001] Steps: 45%|████▌ | 317/700 [02:17<02:45, 2.31it/s, loss=0.0989, lr=0.0001] Steps: 45%|████▌ | 317/700 [02:17<02:45, 2.31it/s, loss=0.0508, lr=0.0001] Steps: 45%|████▌ | 318/700 [02:17<02:45, 2.31it/s, loss=0.0508, lr=0.0001] Steps: 45%|████▌ | 318/700 [02:17<02:45, 2.31it/s, loss=0.0966, lr=0.0001] Steps: 46%|████▌ | 319/700 [02:18<02:44, 2.31it/s, loss=0.0966, lr=0.0001] Steps: 46%|████▌ | 319/700 [02:18<02:44, 2.31it/s, loss=0.186, lr=0.0001] Steps: 46%|████▌ | 320/700 [02:18<02:44, 2.31it/s, loss=0.186, lr=0.0001] Steps: 46%|████▌ | 320/700 [02:18<02:44, 2.31it/s, loss=0.113, lr=0.0001] Steps: 46%|████▌ | 321/700 [02:19<02:44, 2.31it/s, loss=0.113, lr=0.0001] Steps: 46%|████▌ | 321/700 [02:19<02:44, 2.31it/s, loss=0.075, lr=0.0001] Steps: 46%|████▌ | 322/700 [02:19<02:43, 2.31it/s, loss=0.075, lr=0.0001] Steps: 46%|████▌ | 322/700 [02:19<02:43, 2.31it/s, loss=0.1, lr=0.0001] Steps: 46%|████▌ | 323/700 [02:20<02:43, 2.31it/s, loss=0.1, lr=0.0001] Steps: 46%|████▌ | 323/700 [02:20<02:43, 2.31it/s, loss=0.126, lr=0.0001] Steps: 46%|████▋ | 324/700 [02:20<02:42, 2.31it/s, loss=0.126, lr=0.0001] Steps: 46%|████▋ | 324/700 [02:20<02:42, 2.31it/s, loss=0.0512, lr=0.0001] Steps: 46%|████▋ | 325/700 [02:20<02:43, 2.30it/s, loss=0.0512, lr=0.0001] Steps: 46%|████▋ | 325/700 [02:20<02:43, 2.30it/s, loss=0.139, lr=0.0001] Steps: 47%|████▋ | 326/700 [02:21<02:42, 2.30it/s, loss=0.139, lr=0.0001] Steps: 47%|████▋ | 326/700 [02:21<02:42, 2.30it/s, loss=0.123, lr=0.0001] Steps: 47%|████▋ | 327/700 [02:21<02:41, 2.30it/s, loss=0.123, lr=0.0001] Steps: 47%|████▋ | 
327/700 [02:21<02:41, 2.30it/s, loss=0.124, lr=0.0001] Steps: 47%|████▋ | 328/700 [02:22<02:41, 2.31it/s, loss=0.124, lr=0.0001] Steps: 47%|████▋ | 328/700 [02:22<02:41, 2.31it/s, loss=0.0366, lr=0.0001] Steps: 47%|████▋ | 329/700 [02:22<02:41, 2.30it/s, loss=0.0366, lr=0.0001] Steps: 47%|████▋ | 329/700 [02:22<02:41, 2.30it/s, loss=0.0412, lr=0.0001] Steps: 47%|████▋ | 330/700 [02:23<02:40, 2.30it/s, loss=0.0412, lr=0.0001] Steps: 47%|████▋ | 330/700 [02:23<02:40, 2.30it/s, loss=0.0898, lr=0.0001] Steps: 47%|████▋ | 331/700 [02:23<02:40, 2.31it/s, loss=0.0898, lr=0.0001] Steps: 47%|████▋ | 331/700 [02:23<02:40, 2.31it/s, loss=0.127, lr=0.0001] Steps: 47%|████▋ | 332/700 [02:23<02:39, 2.31it/s, loss=0.127, lr=0.0001] Steps: 47%|████▋ | 332/700 [02:23<02:39, 2.31it/s, loss=0.103, lr=0.0001] Steps: 48%|████▊ | 333/700 [02:24<02:39, 2.30it/s, loss=0.103, lr=0.0001] Steps: 48%|████▊ | 333/700 [02:24<02:39, 2.30it/s, loss=0.134, lr=0.0001] Steps: 48%|████▊ | 334/700 [02:24<02:38, 2.30it/s, loss=0.134, lr=0.0001] Steps: 48%|████▊ | 334/700 [02:24<02:38, 2.30it/s, loss=0.142, lr=0.0001] Steps: 48%|████▊ | 335/700 [02:25<02:38, 2.30it/s, loss=0.142, lr=0.0001] Steps: 48%|████▊ | 335/700 [02:25<02:38, 2.30it/s, loss=0.0705, lr=0.0001] Steps: 48%|████▊ | 336/700 [02:25<02:37, 2.31it/s, loss=0.0705, lr=0.0001] Steps: 48%|████▊ | 336/700 [02:25<02:37, 2.31it/s, loss=0.0656, lr=0.0001] Steps: 48%|████▊ | 337/700 [02:26<02:38, 2.30it/s, loss=0.0656, lr=0.0001] Steps: 48%|████▊ | 337/700 [02:26<02:38, 2.30it/s, loss=0.259, lr=0.0001] Steps: 48%|████▊ | 338/700 [02:26<02:37, 2.30it/s, loss=0.259, lr=0.0001] Steps: 48%|████▊ | 338/700 [02:26<02:37, 2.30it/s, loss=0.058, lr=0.0001] Steps: 48%|████▊ | 339/700 [02:27<02:36, 2.30it/s, loss=0.058, lr=0.0001] Steps: 48%|████▊ | 339/700 [02:27<02:36, 2.30it/s, loss=0.0758, lr=0.0001] Steps: 49%|████▊ | 340/700 [02:27<02:36, 2.30it/s, loss=0.0758, lr=0.0001] Steps: 49%|████▊ | 340/700 [02:27<02:36, 2.30it/s, loss=0.151, lr=0.0001] Steps: 
49%|████▊ | 341/700 [02:27<02:35, 2.30it/s, loss=0.151, lr=0.0001] Steps: 49%|████▊ | 341/700 [02:27<02:35, 2.30it/s, loss=0.0809, lr=0.0001] Steps: 49%|████▉ | 342/700 [02:28<02:35, 2.30it/s, loss=0.0809, lr=0.0001] Steps: 49%|████▉ | 342/700 [02:28<02:35, 2.30it/s, loss=0.0832, lr=0.0001] Steps: 49%|████▉ | 343/700 [02:28<02:34, 2.31it/s, loss=0.0832, lr=0.0001] Steps: 49%|████▉ | 343/700 [02:28<02:34, 2.31it/s, loss=0.0567, lr=0.0001] Steps: 49%|████▉ | 344/700 [02:29<02:34, 2.30it/s, loss=0.0567, lr=0.0001] Steps: 49%|████▉ | 344/700 [02:29<02:34, 2.30it/s, loss=0.192, lr=0.0001] Steps: 49%|████▉ | 345/700 [02:29<02:34, 2.30it/s, loss=0.192, lr=0.0001] Steps: 49%|████▉ | 345/700 [02:29<02:34, 2.30it/s, loss=0.147, lr=0.0001] Steps: 49%|████▉ | 346/700 [02:30<02:33, 2.31it/s, loss=0.147, lr=0.0001] Steps: 49%|████▉ | 346/700 [02:30<02:33, 2.31it/s, loss=0.0729, lr=0.0001] Steps: 50%|████▉ | 347/700 [02:30<02:33, 2.31it/s, loss=0.0729, lr=0.0001] Steps: 50%|████▉ | 347/700 [02:30<02:33, 2.31it/s, loss=0.0998, lr=0.0001] Steps: 50%|████▉ | 348/700 [02:30<02:32, 2.31it/s, loss=0.0998, lr=0.0001] Steps: 50%|████▉ | 348/700 [02:30<02:32, 2.31it/s, loss=0.132, lr=0.0001] Steps: 50%|████▉ | 349/700 [02:31<02:32, 2.30it/s, loss=0.132, lr=0.0001] Steps: 50%|████▉ | 349/700 [02:31<02:32, 2.30it/s, loss=0.0276, lr=0.0001] Steps: 50%|█████ | 350/700 [02:31<02:32, 2.30it/s, loss=0.0276, lr=0.0001] Steps: 50%|█████ | 350/700 [02:31<02:32, 2.30it/s, loss=0.198, lr=0.0001] Steps: 50%|█████ | 351/700 [02:32<02:31, 2.30it/s, loss=0.198, lr=0.0001] Steps: 50%|█████ | 351/700 [02:32<02:31, 2.30it/s, loss=0.135, lr=0.0001] Steps: 50%|█████ | 352/700 [02:32<02:30, 2.31it/s, loss=0.135, lr=0.0001] Steps: 50%|█████ | 352/700 [02:32<02:30, 2.31it/s, loss=0.0165, lr=0.0001] Steps: 50%|█████ | 353/700 [02:33<02:30, 2.31it/s, loss=0.0165, lr=0.0001] Steps: 50%|█████ | 353/700 [02:33<02:30, 2.31it/s, loss=0.0565, lr=0.0001] Steps: 51%|█████ | 354/700 [02:33<02:29, 2.31it/s, loss=0.0565, 
lr=0.0001] Steps: 51%|█████ | 354/700 [02:33<02:29, 2.31it/s, loss=0.12, lr=0.0001] Steps: 51%|█████ | 355/700 [02:33<02:29, 2.31it/s, loss=0.12, lr=0.0001] Steps: 51%|█████ | 355/700 [02:33<02:29, 2.31it/s, loss=0.104, lr=0.0001] Steps: 51%|█████ | 356/700 [02:34<02:29, 2.31it/s, loss=0.104, lr=0.0001] Steps: 51%|█████ | 356/700 [02:34<02:29, 2.31it/s, loss=0.0892, lr=0.0001] Steps: 51%|█████ | 357/700 [02:34<02:28, 2.31it/s, loss=0.0892, lr=0.0001] Steps: 51%|█████ | 357/700 [02:34<02:28, 2.31it/s, loss=0.181, lr=0.0001] Steps: 51%|█████ | 358/700 [02:35<02:28, 2.31it/s, loss=0.181, lr=0.0001] Steps: 51%|█████ | 358/700 [02:35<02:28, 2.31it/s, loss=0.0601, lr=0.0001] Steps: 51%|█████▏ | 359/700 [02:35<02:27, 2.31it/s, loss=0.0601, lr=0.0001] Steps: 51%|█████▏ | 359/700 [02:35<02:27, 2.31it/s, loss=0.124, lr=0.0001] Steps: 51%|█████▏ | 360/700 [02:36<02:27, 2.31it/s, loss=0.124, lr=0.0001] Steps: 51%|█████▏ | 360/700 [02:36<02:27, 2.31it/s, loss=0.0831, lr=0.0001] Steps: 52%|█████▏ | 361/700 [02:36<02:27, 2.30it/s, loss=0.0831, lr=0.0001] Steps: 52%|█████▏ | 361/700 [02:36<02:27, 2.30it/s, loss=0.0764, lr=0.0001] Steps: 52%|█████▏ | 362/700 [02:36<02:26, 2.30it/s, loss=0.0764, lr=0.0001] Steps: 52%|█████▏ | 362/700 [02:37<02:26, 2.30it/s, loss=0.189, lr=0.0001] Steps: 52%|█████▏ | 363/700 [02:37<02:26, 2.30it/s, loss=0.189, lr=0.0001] Steps: 52%|█████▏ | 363/700 [02:37<02:26, 2.30it/s, loss=0.0764, lr=0.0001] Steps: 52%|█████▏ | 364/700 [02:37<02:25, 2.31it/s, loss=0.0764, lr=0.0001] Steps: 52%|█████▏ | 364/700 [02:37<02:25, 2.31it/s, loss=0.131, lr=0.0001] Steps: 52%|█████▏ | 365/700 [02:38<02:25, 2.31it/s, loss=0.131, lr=0.0001] Steps: 52%|█████▏ | 365/700 [02:38<02:25, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 52%|█████▏ | 366/700 [02:38<02:24, 2.31it/s, loss=0.0874, lr=0.0001] Steps: 52%|█████▏ | 366/700 [02:38<02:24, 2.31it/s, loss=0.183, lr=0.0001] Steps: 52%|█████▏ | 367/700 [02:39<02:24, 2.31it/s, loss=0.183, lr=0.0001] Steps: 52%|█████▏ | 367/700 
[02:39<02:24, 2.31it/s, loss=0.0662, lr=0.0001] Steps: 53%|█████▎ | 368/700 [02:39<02:23, 2.31it/s, loss=0.0662, lr=0.0001] Steps: 53%|█████▎ | 368/700 [02:39<02:23, 2.31it/s, loss=0.112, lr=0.0001] Steps: 53%|█████▎ | 369/700 [02:40<02:23, 2.31it/s, loss=0.112, lr=0.0001] Steps: 53%|█████▎ | 369/700 [02:40<02:23, 2.31it/s, loss=0.0877, lr=0.0001] Steps: 53%|█████▎ | 370/700 [02:40<02:22, 2.31it/s, loss=0.0877, lr=0.0001] Steps: 53%|█████▎ | 370/700 [02:40<02:22, 2.31it/s, loss=0.0994, lr=0.0001] Steps: 53%|█████▎ | 371/700 [02:40<02:22, 2.31it/s, loss=0.0994, lr=0.0001] Steps: 53%|█████▎ | 371/700 [02:40<02:22, 2.31it/s, loss=0.116, lr=0.0001] Steps: 53%|█████▎ | 372/700 [02:41<02:21, 2.31it/s, loss=0.116, lr=0.0001] Steps: 53%|█████▎ | 372/700 [02:41<02:21, 2.31it/s, loss=0.0953, lr=0.0001] Steps: 53%|█████▎ | 373/700 [02:41<02:22, 2.30it/s, loss=0.0953, lr=0.0001] Steps: 53%|█████▎ | 373/700 [02:41<02:22, 2.30it/s, loss=0.129, lr=0.0001] Steps: 53%|█████▎ | 374/700 [02:42<02:21, 2.31it/s, loss=0.129, lr=0.0001] Steps: 53%|█████▎ | 374/700 [02:42<02:21, 2.31it/s, loss=0.106, lr=0.0001] Steps: 54%|█████▎ | 375/700 [02:42<02:20, 2.31it/s, loss=0.106, lr=0.0001] Steps: 54%|█████▎ | 375/700 [02:42<02:20, 2.31it/s, loss=0.0471, lr=0.0001] Steps: 54%|█████▎ | 376/700 [02:43<02:20, 2.31it/s, loss=0.0471, lr=0.0001] Steps: 54%|█████▎ | 376/700 [02:43<02:20, 2.31it/s, loss=0.0695, lr=0.0001] Steps: 54%|█████▍ | 377/700 [02:43<02:19, 2.31it/s, loss=0.0695, lr=0.0001] Steps: 54%|█████▍ | 377/700 [02:43<02:19, 2.31it/s, loss=0.078, lr=0.0001] Steps: 54%|█████▍ | 378/700 [02:43<02:19, 2.31it/s, loss=0.078, lr=0.0001] Steps: 54%|█████▍ | 378/700 [02:43<02:19, 2.31it/s, loss=0.115, lr=0.0001] Steps: 54%|█████▍ | 379/700 [02:44<02:18, 2.31it/s, loss=0.115, lr=0.0001] Steps: 54%|█████▍ | 379/700 [02:44<02:18, 2.31it/s, loss=0.0692, lr=0.0001] Steps: 54%|█████▍ | 380/700 [02:44<02:18, 2.31it/s, loss=0.0692, lr=0.0001] Steps: 54%|█████▍ | 380/700 [02:44<02:18, 2.31it/s, 
loss=0.0489, lr=0.0001]
Steps: 54%|█████▍ | 381/700 [02:45<02:17, 2.31it/s, loss=0.168, lr=0.0001]
Steps: 55%|█████▍ | 382/700 [02:45<02:17, 2.31it/s, loss=0.0656, lr=0.0001]
Steps: 55%|█████▍ | 383/700 [02:46<02:27, 2.15it/s, loss=0.209, lr=0.0001]
Steps: 55%|█████▍ | 384/700 [02:46<02:23, 2.20it/s, loss=0.134, lr=0.0001]
Steps: 55%|█████▌ | 385/700 [02:47<02:21, 2.22it/s, loss=0.114, lr=0.0001]
Steps: 55%|█████▌ | 386/700 [02:47<02:19, 2.25it/s, loss=0.109, lr=0.0001]
Steps: 55%|█████▌ | 387/700 [02:47<02:18, 2.26it/s, loss=0.0913, lr=0.0001]
Steps: 55%|█████▌ | 388/700 [02:48<02:16, 2.28it/s, loss=0.0507, lr=0.0001]
Steps: 56%|█████▌ | 389/700 [02:48<02:16, 2.28it/s, loss=0.221, lr=0.0001]
Steps: 56%|█████▌ | 390/700 [02:49<02:15, 2.29it/s, loss=0.0575, lr=0.0001]
Steps: 56%|█████▌ | 391/700 [02:49<02:14, 2.30it/s, loss=0.0787, lr=0.0001]
Steps: 56%|█████▌ | 392/700 [02:50<02:13, 2.30it/s, loss=0.121, lr=0.0001]
Steps: 56%|█████▌ | 393/700 [02:50<02:13, 2.30it/s, loss=0.0559, lr=0.0001]
Steps: 56%|█████▋ | 394/700 [02:50<02:12, 2.31it/s, loss=0.0453, lr=0.0001]
Steps: 56%|█████▋ | 395/700 [02:51<02:12, 2.31it/s, loss=0.0741, lr=0.0001]
Steps: 57%|█████▋ | 396/700 [02:51<02:11, 2.31it/s, loss=0.138, lr=0.0001]
Steps: 57%|█████▋ | 397/700 [02:52<02:12, 2.30it/s, loss=0.0937, lr=0.0001]
Steps: 57%|█████▋ | 398/700 [02:52<02:11, 2.30it/s, loss=0.0666, lr=0.0001]
Steps: 57%|█████▋ | 399/700 [02:53<02:10, 2.30it/s, loss=0.0977, lr=0.0001]
Steps: 57%|█████▋ | 400/700 [02:53<02:10, 2.31it/s, loss=0.133, lr=0.0001]
Steps: 57%|█████▋ | 401/700 [02:54<02:09, 2.31it/s, loss=0.0634, lr=0.0001]
Steps: 57%|█████▋ | 402/700 [02:54<02:09, 2.31it/s, loss=0.0826, lr=0.0001]
Steps: 58%|█████▊ | 403/700 [02:54<02:08, 2.31it/s, loss=0.0451, lr=0.0001]
Steps: 58%|█████▊ | 404/700 [02:55<02:08, 2.31it/s, loss=0.146, lr=0.0001]
Steps: 58%|█████▊ | 405/700 [02:55<02:07, 2.31it/s, loss=0.127, lr=0.0001]
Steps: 58%|█████▊ | 406/700 [02:56<02:07, 2.31it/s, loss=0.11, lr=0.0001]
Steps: 58%|█████▊ | 407/700 [02:56<02:06, 2.31it/s, loss=0.0996, lr=0.0001]
Steps: 58%|█████▊ | 408/700 [02:57<02:06, 2.31it/s, loss=0.136, lr=0.0001]
Steps: 58%|█████▊ | 409/700 [02:57<02:06, 2.30it/s, loss=0.174, lr=0.0001]
Steps: 59%|█████▊ | 410/700 [02:57<02:05, 2.30it/s, loss=0.106, lr=0.0001]
Steps: 59%|█████▊ | 411/700 [02:58<02:05, 2.30it/s, loss=0.137, lr=0.0001]
Steps: 59%|█████▉ | 412/700 [02:58<02:04, 2.31it/s, loss=0.0351, lr=0.0001]
Steps: 59%|█████▉ | 413/700 [02:59<02:04, 2.31it/s, loss=0.136, lr=0.0001]
Steps: 59%|█████▉ | 414/700 [02:59<02:03, 2.31it/s, loss=0.0681, lr=0.0001]
Steps: 59%|█████▉ | 415/700 [03:00<02:03, 2.31it/s, loss=0.0218, lr=0.0001]
Steps: 59%|█████▉ | 416/700 [03:00<02:02, 2.31it/s, loss=0.0585, lr=0.0001]
Steps: 60%|█████▉ | 417/700 [03:00<02:02, 2.31it/s, loss=0.0662, lr=0.0001]
Steps: 60%|█████▉ | 418/700 [03:01<02:02, 2.31it/s, loss=0.0406, lr=0.0001]
Steps: 60%|█████▉ | 419/700 [03:01<02:01, 2.31it/s, loss=0.0997, lr=0.0001]
Steps: 60%|██████ | 420/700 [03:02<02:01, 2.31it/s, loss=0.13, lr=0.0001]
Steps: 60%|██████ | 421/700 [03:02<02:01, 2.30it/s, loss=0.0885, lr=0.0001]
Steps: 60%|██████ | 422/700 [03:03<02:00, 2.30it/s, loss=0.0947, lr=0.0001]
Steps: 60%|██████ | 423/700 [03:03<02:00, 2.31it/s, loss=0.0685, lr=0.0001]
Steps: 61%|██████ | 424/700 [03:03<01:59, 2.31it/s, loss=0.0901, lr=0.0001]
Steps: 61%|██████ | 425/700 [03:04<01:59, 2.31it/s, loss=0.233, lr=0.0001]
Steps: 61%|██████ | 426/700 [03:04<01:58, 2.31it/s, loss=0.0955, lr=0.0001]
Steps: 61%|██████ | 427/700 [03:05<01:58, 2.31it/s, loss=0.174, lr=0.0001]
Steps: 61%|██████ | 428/700 [03:05<01:57, 2.31it/s, loss=0.0519, lr=0.0001]
Steps: 61%|██████▏ | 429/700 [03:06<01:57, 2.31it/s, loss=0.0831, lr=0.0001]
Steps: 61%|██████▏ | 430/700 [03:06<01:56, 2.31it/s, loss=0.117, lr=0.0001]
Steps: 62%|██████▏ | 431/700 [03:07<01:56, 2.31it/s, loss=0.149, lr=0.0001]
Steps: 62%|██████▏ | 432/700 [03:07<01:55, 2.31it/s, loss=0.0795, lr=0.0001]
Steps: 62%|██████▏ | 433/700 [03:07<01:56, 2.30it/s, loss=0.107, lr=0.0001]
Steps: 62%|██████▏ | 434/700 [03:08<01:55, 2.30it/s, loss=0.0928, lr=0.0001]
Steps: 62%|██████▏ | 435/700 [03:08<01:54, 2.31it/s, loss=0.0883, lr=0.0001]
Steps: 62%|██████▏ | 436/700 [03:09<01:54, 2.31it/s, loss=0.0757, lr=0.0001]
Steps: 62%|██████▏ | 437/700 [03:09<01:53, 2.31it/s, loss=0.119, lr=0.0001]
Steps: 63%|██████▎ | 438/700 [03:10<01:53, 2.31it/s, loss=0.126, lr=0.0001]
Steps: 63%|██████▎ | 439/700 [03:10<01:53, 2.31it/s, loss=0.0425, lr=0.0001]
Steps: 63%|██████▎ | 440/700 [03:10<01:52, 2.31it/s, loss=0.143, lr=0.0001]
Steps: 63%|██████▎ | 441/700 [03:11<01:52, 2.31it/s, loss=0.0846, lr=0.0001]
Steps: 63%|██████▎ | 442/700 [03:11<01:51, 2.31it/s, loss=0.0804, lr=0.0001]
Steps: 63%|██████▎ | 443/700 [03:12<01:51, 2.31it/s, loss=0.139, lr=0.0001]
Steps: 63%|██████▎ | 444/700 [03:12<01:50, 2.31it/s, loss=0.115, lr=0.0001]
Steps: 64%|██████▎ | 445/700 [03:13<01:50, 2.30it/s, loss=0.0897, lr=0.0001]
Steps: 64%|██████▎ | 446/700 [03:13<01:50, 2.30it/s, loss=0.0656, lr=0.0001]
Steps: 64%|██████▍ | 447/700 [03:13<01:49, 2.31it/s, loss=0.0926, lr=0.0001]
Steps: 64%|██████▍ | 448/700 [03:14<01:49, 2.31it/s, loss=0.0764, lr=0.0001]
Steps: 64%|██████▍ | 449/700 [03:14<01:48, 2.31it/s, loss=0.0648, lr=0.0001]
Steps: 64%|██████▍ | 450/700 [03:15<01:48, 2.31it/s, loss=0.0487, lr=0.0001]
Steps: 64%|██████▍ | 451/700 [03:15<01:47, 2.31it/s, loss=0.0588, lr=0.0001]
Steps: 65%|██████▍ | 452/700 [03:16<01:47, 2.31it/s, loss=0.0702, lr=0.0001]
Steps: 65%|██████▍ | 453/700 [03:16<01:46, 2.31it/s, loss=0.0665, lr=0.0001]
Steps: 65%|██████▍ | 454/700 [03:16<01:46, 2.31it/s, loss=0.189, lr=0.0001]
Steps: 65%|██████▌ | 455/700 [03:17<01:46, 2.31it/s, loss=0.105, lr=0.0001]
Steps: 65%|██████▌ | 456/700 [03:17<01:45, 2.31it/s, loss=0.114, lr=0.0001]
Steps: 65%|██████▌ | 457/700 [03:18<01:45, 2.30it/s, loss=0.0849, lr=0.0001]
Steps: 65%|██████▌ | 458/700 [03:18<01:45, 2.30it/s, loss=0.084, lr=0.0001]
Steps: 66%|██████▌ | 459/700 [03:19<01:44, 2.31it/s, loss=0.165, lr=0.0001]
Steps: 66%|██████▌ | 460/700 [03:19<01:43, 2.31it/s, loss=0.0867, lr=0.0001]
Steps: 66%|██████▌ | 461/700 [03:20<01:43, 2.31it/s, loss=0.0846, lr=0.0001]
Steps: 66%|██████▌ | 462/700 [03:20<01:43, 2.31it/s, loss=0.107, lr=0.0001]
Steps: 66%|██████▌ | 463/700 [03:20<01:42, 2.31it/s, loss=0.0725, lr=0.0001]
Steps: 66%|██████▋ | 464/700 [03:21<01:42, 2.31it/s, loss=0.0726, lr=0.0001]
Steps: 66%|██████▋ | 465/700 [03:21<01:41, 2.31it/s, loss=0.106, lr=0.0001]
Steps: 67%|██████▋ | 466/700 [03:22<01:41, 2.31it/s, loss=0.134, lr=0.0001]
Steps: 67%|██████▋ | 467/700 [03:22<01:40, 2.31it/s, loss=0.0495, lr=0.0001]
Steps: 67%|██████▋ | 468/700 [03:23<01:40, 2.31it/s, loss=0.0961, lr=0.0001]
Steps: 67%|██████▋ | 469/700 [03:23<01:40, 2.30it/s, loss=0.0487, lr=0.0001]
Steps: 67%|██████▋ | 470/700 [03:23<01:39, 2.30it/s, loss=0.0406, lr=0.0001]
Steps: 67%|██████▋ | 471/700 [03:24<01:39, 2.31it/s, loss=0.044, lr=0.0001]
Steps: 67%|██████▋ | 472/700 [03:24<01:38, 2.31it/s, loss=0.149, lr=0.0001]
Steps: 68%|██████▊ | 473/700 [03:25<01:38, 2.31it/s, loss=0.0757, lr=0.0001]
Steps: 68%|██████▊ | 474/700 [03:25<01:37, 2.31it/s, loss=0.0707, lr=0.0001]
Steps: 68%|██████▊ | 475/700 [03:26<01:37, 2.31it/s, loss=0.0752, lr=0.0001]
Steps: 68%|██████▊ | 476/700 [03:26<01:36, 2.31it/s, loss=0.0597, lr=0.0001]
Steps: 68%|██████▊ | 477/700 [03:26<01:36, 2.31it/s, loss=0.078, lr=0.0001]
Steps: 68%|██████▊ | 478/700 [03:27<01:36, 2.31it/s, loss=0.109, lr=0.0001]
Steps: 68%|██████▊ | 479/700 [03:27<01:35, 2.31it/s, loss=0.0575, lr=0.0001]
Steps: 69%|██████▊ | 480/700 [03:28<01:35, 2.31it/s, loss=0.099, lr=0.0001]
Steps: 69%|██████▊ | 481/700 [03:28<01:35, 2.30it/s, loss=0.119, lr=0.0001]
Steps: 69%|██████▉ | 482/700 [03:29<01:34, 2.30it/s, loss=0.246, lr=0.0001]
Steps: 69%|██████▉ | 483/700 [03:29<01:34, 2.31it/s, loss=0.0938, lr=0.0001]
Steps: 69%|██████▉ | 484/700 [03:29<01:33, 2.31it/s, loss=0.0895, lr=0.0001]
Steps: 69%|██████▉ | 485/700 [03:30<01:33, 2.31it/s, loss=0.146, lr=0.0001]
Steps: 69%|██████▉ | 486/700 [03:30<01:32, 2.31it/s, loss=0.0565, lr=0.0001]
Steps: 70%|██████▉ | 487/700 [03:31<01:32, 2.31it/s, loss=0.142, lr=0.0001]
Steps: 70%|██████▉ | 488/700 [03:31<01:31, 2.31it/s, loss=0.0218, lr=0.0001]
Steps: 70%|██████▉ | 489/700 [03:32<01:31, 2.31it/s, loss=0.0811, lr=0.0001]
Steps: 70%|███████ | 490/700 [03:32<01:30, 2.31it/s, loss=0.0571, lr=0.0001]
Steps: 70%|███████ | 491/700 [03:33<01:30, 2.31it/s, loss=0.109, lr=0.0001]
Steps: 70%|███████ | 492/700 [03:33<01:29, 2.31it/s, loss=0.136, lr=0.0001]
Steps: 70%|███████ | 493/700 [03:33<01:29, 2.30it/s, loss=0.233, lr=0.0001]
Steps: 71%|███████ | 494/700 [03:34<01:29, 2.30it/s, loss=0.0985, lr=0.0001]
Steps: 71%|███████ | 495/700 [03:34<01:28, 2.31it/s, loss=0.0914, lr=0.0001]
Steps: 71%|███████ | 496/700 [03:35<01:28, 2.31it/s, loss=0.126, lr=0.0001]
Steps: 71%|███████ | 497/700 [03:35<01:27, 2.31it/s, loss=0.112, lr=0.0001]
Steps: 71%|███████ | 498/700 [03:36<01:27, 2.31it/s, loss=0.0553, lr=0.0001]
Steps: 71%|███████▏ | 499/700 [03:36<01:27, 2.31it/s, loss=0.142, lr=0.0001]
Steps: 71%|███████▏ | 500/700 [03:36<01:26, 2.31it/s, loss=0.129, lr=0.0001]
Steps: 72%|███████▏ | 501/700 [03:37<01:26, 2.31it/s, loss=0.0563, lr=0.0001]
Steps: 72%|███████▏ | 502/700 [03:37<01:25, 2.31it/s, loss=0.234, lr=0.0001]
Steps: 72%|███████▏ | 503/700 [03:38<01:25, 2.31it/s, loss=0.103, lr=0.0001]
Steps: 72%|███████▏ | 504/700 [03:38<01:24, 2.31it/s, loss=0.0646, lr=0.0001]
Steps: 72%|███████▏ | 505/700 [03:39<01:24, 2.30it/s, loss=0.101, lr=0.0001]
Steps: 72%|███████▏ | 506/700 [03:39<01:24, 2.30it/s, loss=0.0391, lr=0.0001]
Steps: 72%|███████▏ | 507/700 [03:39<01:23, 2.31it/s, loss=0.0464, lr=0.0001]
Steps: 73%|███████▎ | 508/700 [03:40<01:23, 2.31it/s, loss=0.0821, lr=0.0001]
Steps: 73%|███████▎ | 509/700 [03:40<01:22, 2.31it/s, loss=0.191, lr=0.0001]
Steps: 73%|███████▎ | 510/700 [03:41<01:22, 2.31it/s, loss=0.0574, lr=0.0001]
Steps: 73%|███████▎ | 511/700 [03:41<01:21, 2.31it/s, loss=0.0778, lr=0.0001]
Steps: 73%|███████▎ | 512/700 [03:42<01:21, 2.31it/s, loss=0.179, lr=0.0001]
Steps: 73%|███████▎ | 513/700 [03:42<01:20, 2.31it/s, loss=0.0893, lr=0.0001]
Steps: 73%|███████▎ | 514/700 [03:42<01:20, 2.31it/s, loss=0.0585, lr=0.0001]
Steps: 74%|███████▎ | 515/700 [03:43<01:20, 2.31it/s, loss=0.0622, lr=0.0001]
Steps: 74%|███████▎ | 516/700 [03:43<01:19, 2.31it/s, loss=0.0993, lr=0.0001]
Steps: 74%|███████▍ | 517/700 [03:44<01:19, 2.30it/s, loss=0.0807, lr=0.0001]
Steps: 74%|███████▍ | 518/700 [03:44<01:18, 2.31it/s, loss=0.1, lr=0.0001]
Steps: 74%|███████▍ | 519/700 [03:45<01:18, 2.31it/s, loss=0.0567, lr=0.0001]
Steps: 74%|███████▍ | 520/700 [03:45<01:18, 2.31it/s, loss=0.163, lr=0.0001]
Steps: 74%|███████▍ | 521/700 [03:45<01:17, 2.31it/s, loss=0.146, lr=0.0001]
Steps: 75%|███████▍ | 522/700 [03:46<01:17, 2.31it/s, loss=0.12, lr=0.0001]
Steps: 75%|███████▍ | 523/700 [03:46<01:16, 2.31it/s, loss=0.181, lr=0.0001]
Steps: 75%|███████▍ | 524/700 [03:47<01:16, 2.31it/s, loss=0.147, lr=0.0001]
Steps: 75%|███████▌ | 525/700 [03:47<01:15, 2.31it/s, loss=0.0812, lr=0.0001]
Steps: 75%|███████▌ | 526/700 [03:48<01:15, 2.31it/s, loss=0.0597, lr=0.0001]
Steps: 75%|███████▌ | 527/700 [03:48<01:14, 2.31it/s, loss=0.0656, lr=0.0001]
Steps: 75%|███████▌ | 528/700 [03:49<01:14, 2.31it/s, loss=0.114, lr=0.0001]
Steps: 76%|███████▌ | 529/700 [03:49<01:14, 2.30it/s, loss=0.0865, lr=0.0001]
Steps: 76%|███████▌ | 530/700 [03:49<01:13, 2.30it/s, loss=0.0999, lr=0.0001]
Steps: 76%|███████▌ | 531/700 [03:50<01:13, 2.31it/s, loss=0.142, lr=0.0001]
Steps: 76%|███████▌ | 532/700 [03:50<01:12, 2.31it/s, loss=0.0418, lr=0.0001]
Steps: 76%|███████▌ | 533/700 [03:51<01:12, 2.31it/s, loss=0.0675, lr=0.0001]
Steps: 76%|███████▋ | 534/700 [03:51<01:11, 2.31it/s, loss=0.051, lr=0.0001]
Steps: 76%|███████▋ | 535/700 [03:52<01:11, 2.31it/s, loss=0.131, lr=0.0001]
Steps: 77%|███████▋ | 536/700 [03:52<01:11, 2.31it/s, loss=0.0786, lr=0.0001]
Steps: 77%|███████▋ | 537/700 [03:52<01:10, 2.31it/s, loss=0.122, lr=0.0001]
Steps: 77%|███████▋ | 538/700 [03:53<01:10, 2.31it/s, loss=0.0734, lr=0.0001]
Steps: 77%|███████▋ | 539/700 [03:53<01:09, 2.31it/s, loss=0.0796, lr=0.0001]
Steps: 77%|███████▋ | 540/700 [03:54<01:09, 2.31it/s, loss=0.0497, lr=0.0001]
Steps: 77%|███████▋ | 541/700 [03:54<01:09, 2.30it/s, loss=0.14, lr=0.0001]
Steps: 77%|███████▋ | 542/700 [03:55<01:08, 2.30it/s, loss=0.059, lr=0.0001]
Steps: 78%|███████▊ | 543/700 [03:55<01:08, 2.30it/s, loss=0.0413, lr=0.0001]
Steps: 78%|███████▊ | 544/700 [03:55<01:07, 2.30it/s, loss=0.0563, lr=0.0001]
Steps: 78%|███████▊ | 545/700 [03:56<01:07, 2.31it/s, loss=0.0928, lr=0.0001]
Steps: 78%|███████▊ | 546/700 [03:56<01:06, 2.31it/s, loss=0.121, lr=0.0001]
Steps: 78%|███████▊ | 547/700 [03:57<01:06, 2.31it/s, loss=0.107, lr=0.0001]
Steps: 78%|███████▊ | 548/700 [03:57<01:05, 2.31it/s, loss=0.11, lr=0.0001]
Steps: 78%|███████▊ | 549/700 [03:58<01:05, 2.31it/s, loss=0.0758, lr=0.0001]
Steps: 79%|███████▊ | 550/700 [03:58<01:04, 2.31it/s, loss=0.0922, lr=0.0001]
Steps: 79%|███████▊ | 551/700 [03:58<01:04, 2.31it/s, loss=0.0692, lr=0.0001]
Steps: 79%|███████▉ | 552/700 [03:59<01:04, 2.31it/s, loss=0.0917, lr=0.0001]
Steps: 79%|███████▉ | 553/700 [03:59<01:03, 2.30it/s, loss=0.0807, lr=0.0001]
Steps: 79%|███████▉ | 554/700 [04:00<01:03, 2.30it/s, loss=0.0807, lr=0.0001]
Steps: 79%|███████▉ | 555/700 [04:00<01:02, 2.31it/s, loss=0.121, lr=0.0001]
Steps: 79%|███████▉ | 556/700 [04:01<01:02, 2.31it/s, loss=0.0876, lr=0.0001]
Steps: 80%|███████▉ | 557/700 [04:01<01:01, 2.31it/s, loss=0.114, lr=0.0001]
Steps: 80%|███████▉ | 558/700 [04:02<01:01, 2.31it/s, loss=0.0979, lr=0.0001]
Steps: 80%|███████▉ | 559/700 [04:02<01:01, 2.31it/s, loss=0.0651, lr=0.0001]
Steps: 80%|████████ | 560/700 [04:02<01:00, 2.31it/s, loss=0.064, lr=0.0001]
Steps: 80%|████████ | 561/700 [04:03<01:00, 2.31it/s, loss=0.137, lr=0.0001]
Steps: 80%|████████ | 562/700 [04:03<00:59, 2.31it/s, loss=0.0395, lr=0.0001]
Steps: 80%|████████ | 563/700 [04:04<00:59, 2.31it/s, loss=0.107, lr=0.0001]
Steps: 81%|████████ | 564/700 [04:04<00:58, 2.31it/s, loss=0.0575, lr=0.0001]
Steps: 81%|████████ | 565/700 [04:05<00:58, 2.29it/s, loss=0.0622, lr=0.0001]
Steps: 81%|████████ | 566/700 [04:05<00:58, 2.30it/s, loss=0.0854, lr=0.0001]
Steps: 81%|████████ | 567/700 [04:05<00:57, 2.31it/s, loss=0.0195, lr=0.0001]
Steps: 81%|████████ | 568/700 [04:06<00:57, 2.31it/s, loss=0.105, lr=0.0001]
Steps: 81%|████████▏ | 569/700 [04:06<00:56, 2.31it/s, loss=0.11, lr=0.0001]
Steps: 81%|████████▏ | 570/700 [04:07<00:56, 2.31it/s, loss=0.0211, lr=0.0001]
Steps: 82%|████████▏ | 571/700 [04:07<00:55, 2.31it/s, loss=0.0886, lr=0.0001]
Steps: 82%|████████▏ | 572/700 [04:08<00:55, 2.31it/s, loss=0.103, lr=0.0001]
Steps: 82%|████████▏ | 573/700 [04:08<00:54, 2.31it/s, loss=0.0681, lr=0.0001]
Steps: 82%|████████▏ | 574/700 [04:08<00:54, 2.31it/s, loss=0.0704, lr=0.0001]
Steps: 82%|████████▏ | 575/700 [04:09<00:54, 2.31it/s, loss=0.044, lr=0.0001]
Steps: 82%|████████▏ | 576/700 [04:09<00:53, 2.31it/s, loss=0.0852, lr=0.0001]
Steps: 82%|████████▏ | 577/700 [04:10<00:53, 2.30it/s, loss=0.176, lr=0.0001]
Steps: 83%|████████▎ | 578/700 [04:10<00:52, 2.30it/s, loss=0.0449, lr=0.0001]
Steps: 83%|████████▎ | 579/700 [04:11<00:52, 2.30it/s, loss=0.0719, lr=0.0001]
Steps: 83%|████████▎ | 580/700 [04:11<00:52, 2.31it/s, loss=0.0621, lr=0.0001]
Steps: 83%|████████▎ | 581/700 [04:11<00:51, 2.31it/s, loss=0.106, lr=0.0001]
Steps: 83%|████████▎ | 582/700 [04:12<00:51, 2.31it/s, loss=0.057, lr=0.0001]
Steps: 83%|████████▎ | 583/700 [04:12<00:50, 2.31it/s, loss=0.0693, lr=0.0001]
Steps: 83%|████████▎ | 584/700 [04:13<00:50, 2.31it/s, loss=0.0972, lr=0.0001]
Steps: 84%|████████▎ | 585/700 [04:13<00:49, 2.31it/s, loss=0.0737, lr=0.0001]
Steps: 84%|████████▎ | 586/700 [04:14<00:49, 2.31it/s, loss=0.099, lr=0.0001]
Steps: 84%|████████▍ | 587/700 [04:14<00:48, 2.31it/s, loss=0.0862, lr=0.0001]
Steps: 84%|████████▍ | 588/700 [04:15<00:48, 2.31it/s, loss=0.0784, lr=0.0001]
Steps: 84%|████████▍ | 589/700 [04:15<00:48, 2.30it/s, loss=0.0483, lr=0.0001]
Steps: 84%|████████▍ | 590/700 [04:15<00:47, 2.30it/s, loss=0.051, lr=0.0001]
Steps: 84%|████████▍ | 591/700 [04:16<00:47, 2.31it/s, loss=0.171, lr=0.0001]
Steps: 85%|████████▍ | 592/700 [04:16<00:46, 2.31it/s, loss=0.0579, lr=0.0001]
Steps: 85%|████████▍ | 593/700 [04:17<00:46, 2.31it/s, loss=0.174, lr=0.0001]
Steps: 85%|████████▍ | 594/700 [04:17<00:45, 2.31it/s, loss=0.0611, lr=0.0001]
Steps: 85%|████████▌ | 595/700 [04:18<00:45, 2.31it/s, loss=0.0749, lr=0.0001]
Steps: 85%|████████▌ | 596/700 [04:18<00:45, 2.31it/s, loss=0.108, lr=0.0001]
Steps: 85%|████████▌ | 597/700 [04:18<00:44, 2.31it/s, loss=0.0268, lr=0.0001]
Steps: 85%|████████▌ | 598/700 [04:19<00:44, 2.31it/s, loss=0.11, lr=0.0001]
Steps: 86%|████████▌ | 599/700 [04:19<00:43, 2.31it/s, loss=0.122, lr=0.0001]
Steps: 86%|████████▌ | 600/700 [04:20<00:43, 2.31it/s, loss=0.129, lr=0.0001]
Steps: 86%|████████▌ | 601/700 [04:20<00:43, 2.30it/s, loss=0.0724, lr=0.0001]
Steps: 86%|████████▌ | 602/700 [04:21<00:42, 2.30it/s, loss=0.0995, lr=0.0001]
Steps: 86%|████████▌ | 603/700 [04:21<00:42, 2.30it/s, loss=0.138, lr=0.0001]
Steps: 86%|████████▋ | 604/700 [04:21<00:41, 2.31it/s, loss=0.173, lr=0.0001]
Steps: 86%|████████▋ | 605/700 [04:22<00:41, 2.31it/s, loss=0.0835, lr=0.0001]
Steps: 87%|████████▋ | 606/700 [04:22<00:40, 2.31it/s, loss=0.0355, lr=0.0001]
Steps: 87%|████████▋ | 607/700 [04:23<00:40, 2.31it/s, loss=0.08, lr=0.0001]
Steps: 87%|████████▋ | 608/700 [04:23<00:39, 2.31it/s, loss=0.0755, lr=0.0001]
Steps: 87%|████████▋ | 609/700 [04:24<00:39, 2.31it/s, loss=0.0997, lr=0.0001]
Steps: 87%|████████▋ | 610/700 [04:24<00:39, 2.31it/s, loss=0.0443, lr=0.0001]
Steps: 87%|████████▋ | 611/700 [04:24<00:38, 2.31it/s, loss=0.0704, lr=0.0001]
Steps: 87%|████████▋ | 612/700 [04:25<00:38, 2.31it/s, loss=0.175, lr=0.0001]
Steps: 88%|████████▊ | 613/700 [04:25<00:37, 2.30it/s, loss=0.0591, lr=0.0001]
Steps: 88%|████████▊ | 614/700 [04:26<00:37, 2.30it/s, loss=0.0502, lr=0.0001]
Steps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.0879, lr=0.0001]
Steps: 88%|████████▊ | 616/700 [04:27<00:36, 2.31it/s, loss=0.134, lr=0.0001]
Steps: 88%|████████▊ | 617/700 [04:27<00:35, 2.31it/s, loss=0.0696, lr=0.0001]
Steps: 88%|████████▊ | 618/700 [04:28<00:35, 2.31it/s, loss=0.0538, lr=0.0001]
Steps: 88%|████████▊ | 619/700 [04:28<00:35, 2.31it/s, loss=0.112, lr=0.0001]
Steps: 89%|████████▊ | 620/700 [04:28<00:34, 2.31it/s, loss=0.0917, lr=0.0001]
Steps: 89%|████████▊ | 621/700 [04:29<00:34, 2.31it/s, loss=0.114, lr=0.0001]
Steps: 89%|████████▉ | 622/700 [04:29<00:33, 2.31it/s, loss=0.0821, lr=0.0001]
Steps: 89%|████████▉ | 623/700 [04:30<00:33, 2.31it/s, loss=0.0445, lr=0.0001]
Steps: 89%|████████▉ | 624/700 [04:30<00:32, 2.32it/s, loss=0.0757, lr=0.0001]
Steps: 89%|████████▉ | 625/700 [04:31<00:32, 2.30it/s, loss=0.0623, lr=0.0001]
Steps: 89%|████████▉ | 626/700 [04:31<00:32, 2.31it/s, loss=0.129, lr=0.0001]
Steps: 90%|████████▉ | 627/700 [04:31<00:31, 2.31it/s, loss=0.0883, lr=0.0001]
Steps: 90%|████████▉ | 628/700 [04:32<00:31, 2.31it/s, loss=0.0954, lr=0.0001]
Steps: 90%|████████▉ | 629/700 [04:32<00:30, 2.31it/s, loss=0.174, lr=0.0001]
Steps: 90%|█████████ | 630/700 [04:33<00:30, 2.31it/s, loss=0.104, lr=0.0001]
Steps: 90%|█████████ | 631/700 [04:33<00:29, 2.31it/s, loss=0.104, lr=0.0001]
Steps: 90%|█████████ | 632/700 [04:34<00:29, 2.31it/s, loss=0.12, lr=0.0001]
Steps: 90%|█████████ | 633/700 [04:34<00:28, 2.31it/s, loss=0.0687, lr=0.0001]
Steps: 91%|█████████ | 634/700 [04:34<00:28, 2.31it/s, loss=0.153, lr=0.0001]
Steps: 91%|█████████ | 635/700 [04:35<00:28, 2.31it/s, loss=0.105, lr=0.0001]
Steps: 91%|█████████ | 636/700 [04:35<00:27, 2.31it/s, loss=0.0692, lr=0.0001]
Steps: 91%|█████████ | 637/700 [04:36<00:27, 2.30it/s, loss=0.101, lr=0.0001]
Steps: 91%|█████████ | 638/700 [04:36<00:26, 2.31it/s, loss=0.0891, lr=0.0001]
Steps: 91%|█████████▏| 639/700 [04:37<00:26, 2.31it/s, loss=0.201, lr=0.0001]
Steps: 91%|█████████▏| 640/700 [04:37<00:25, 2.31it/s, loss=0.136, lr=0.0001]
Steps: 92%|█████████▏| 641/700 [04:37<00:25, 2.31it/s, loss=0.0821, lr=0.0001]
Steps: 92%|█████████▏| 642/700 [04:38<00:25, 2.31it/s, loss=0.18, lr=0.0001]
Steps: 92%|█████████▏| 643/700 [04:38<00:24, 2.31it/s, loss=0.0533, lr=0.0001]
Steps: 92%|█████████▏| 644/700 [04:39<00:24, 2.31it/s, loss=0.0746, lr=0.0001]
Steps: 92%|█████████▏| 645/700 [04:39<00:23, 2.31it/s, loss=0.0947, lr=0.0001]
Steps: 92%|█████████▏| 646/700 [04:40<00:23, 2.31it/s, loss=0.0792, lr=0.0001]
Steps: 92%|█████████▏| 647/700 [04:40<00:22, 2.31it/s, loss=0.0432, lr=0.0001]
Steps: 93%|█████████▎| 648/700 [04:41<00:22, 2.31it/s, loss=0.105, lr=0.0001]
Steps: 93%|█████████▎| 649/700 [04:41<00:22, 2.30it/s, loss=0.135, lr=0.0001]
Steps: 93%|█████████▎| 650/700 [04:41<00:21, 2.31it/s, loss=0.214, lr=0.0001]
Steps: 93%|█████████▎| 651/700 [04:42<00:21, 2.30it/s, loss=0.108, lr=0.0001]
Steps: 93%|█████████▎| 652/700 [04:42<00:20, 2.31it/s, loss=0.108, lr=0.0001] Steps: 93%|█████████▎| 652/700 [04:42<00:20, 2.31it/s, loss=0.0568, lr=0.0001] Steps: 93%|█████████▎| 653/700 [04:43<00:20, 2.31it/s, loss=0.0568, lr=0.0001] Steps: 93%|█████████▎| 653/700 [04:43<00:20, 2.31it/s, loss=0.131, lr=0.0001] Steps: 93%|█████████▎| 654/700 [04:43<00:19, 2.31it/s, loss=0.131, lr=0.0001] Steps: 93%|█████████▎| 654/700 [04:43<00:19, 2.31it/s, loss=0.143, lr=0.0001] Steps: 94%|█████████▎| 655/700 [04:44<00:19, 2.31it/s, loss=0.143, lr=0.0001] Steps: 94%|█████████▎| 655/700 [04:44<00:19, 2.31it/s, loss=0.15, lr=0.0001] Steps: 94%|█████████▎| 656/700 [04:44<00:19, 2.31it/s, loss=0.15, lr=0.0001] Steps: 94%|█████████▎| 656/700 [04:44<00:19, 2.31it/s, loss=0.0982, lr=0.0001] Steps: 94%|█████████▍| 657/700 [04:44<00:18, 2.31it/s, loss=0.0982, lr=0.0001] Steps: 94%|█████████▍| 657/700 [04:44<00:18, 2.31it/s, loss=0.0432, lr=0.0001] Steps: 94%|█████████▍| 658/700 [04:45<00:18, 2.31it/s, loss=0.0432, lr=0.0001] Steps: 94%|█████████▍| 658/700 [04:45<00:18, 2.31it/s, loss=0.116, lr=0.0001] Steps: 94%|█████████▍| 659/700 [04:45<00:17, 2.31it/s, loss=0.116, lr=0.0001] Steps: 94%|█████████▍| 659/700 [04:45<00:17, 2.31it/s, loss=0.111, lr=0.0001] Steps: 94%|█████████▍| 660/700 [04:46<00:17, 2.31it/s, loss=0.111, lr=0.0001] Steps: 94%|█████████▍| 660/700 [04:46<00:17, 2.31it/s, loss=0.0972, lr=0.0001] Steps: 94%|█████████▍| 661/700 [04:46<00:16, 2.30it/s, loss=0.0972, lr=0.0001] Steps: 94%|█████████▍| 661/700 [04:46<00:16, 2.30it/s, loss=0.0867, lr=0.0001] Steps: 95%|█████████▍| 662/700 [04:47<00:16, 2.31it/s, loss=0.0867, lr=0.0001] Steps: 95%|█████████▍| 662/700 [04:47<00:16, 2.31it/s, loss=0.177, lr=0.0001] Steps: 95%|█████████▍| 663/700 [04:47<00:16, 2.31it/s, loss=0.177, lr=0.0001] Steps: 95%|█████████▍| 663/700 [04:47<00:16, 2.31it/s, loss=0.158, lr=0.0001] Steps: 95%|█████████▍| 664/700 [04:47<00:15, 2.31it/s, loss=0.158, lr=0.0001] Steps: 95%|█████████▍| 664/700 
[04:47<00:15, 2.31it/s, loss=0.185, lr=0.0001] Steps: 95%|█████████▌| 665/700 [04:48<00:15, 2.31it/s, loss=0.185, lr=0.0001] Steps: 95%|█████████▌| 665/700 [04:48<00:15, 2.31it/s, loss=0.0858, lr=0.0001] Steps: 95%|█████████▌| 666/700 [04:48<00:14, 2.31it/s, loss=0.0858, lr=0.0001] Steps: 95%|█████████▌| 666/700 [04:48<00:14, 2.31it/s, loss=0.137, lr=0.0001] Steps: 95%|█████████▌| 667/700 [04:49<00:14, 2.31it/s, loss=0.137, lr=0.0001] Steps: 95%|█████████▌| 667/700 [04:49<00:14, 2.31it/s, loss=0.0444, lr=0.0001] Steps: 95%|█████████▌| 668/700 [04:49<00:13, 2.31it/s, loss=0.0444, lr=0.0001] Steps: 95%|█████████▌| 668/700 [04:49<00:13, 2.31it/s, loss=0.106, lr=0.0001] Steps: 96%|█████████▌| 669/700 [04:50<00:13, 2.31it/s, loss=0.106, lr=0.0001] Steps: 96%|█████████▌| 669/700 [04:50<00:13, 2.31it/s, loss=0.0327, lr=0.0001] Steps: 96%|█████████▌| 670/700 [04:50<00:12, 2.31it/s, loss=0.0327, lr=0.0001] Steps: 96%|█████████▌| 670/700 [04:50<00:12, 2.31it/s, loss=0.0921, lr=0.0001] Steps: 96%|█████████▌| 671/700 [04:50<00:12, 2.31it/s, loss=0.0921, lr=0.0001] Steps: 96%|█████████▌| 671/700 [04:50<00:12, 2.31it/s, loss=0.122, lr=0.0001] Steps: 96%|█████████▌| 672/700 [04:51<00:12, 2.31it/s, loss=0.122, lr=0.0001] Steps: 96%|█████████▌| 672/700 [04:51<00:12, 2.31it/s, loss=0.055, lr=0.0001] Steps: 96%|█████████▌| 673/700 [04:51<00:11, 2.30it/s, loss=0.055, lr=0.0001] Steps: 96%|█████████▌| 673/700 [04:51<00:11, 2.30it/s, loss=0.0406, lr=0.0001] Steps: 96%|█████████▋| 674/700 [04:52<00:11, 2.31it/s, loss=0.0406, lr=0.0001] Steps: 96%|█████████▋| 674/700 [04:52<00:11, 2.31it/s, loss=0.0989, lr=0.0001] Steps: 96%|█████████▋| 675/700 [04:52<00:10, 2.31it/s, loss=0.0989, lr=0.0001] Steps: 96%|█████████▋| 675/700 [04:52<00:10, 2.31it/s, loss=0.0807, lr=0.0001] Steps: 97%|█████████▋| 676/700 [04:53<00:10, 2.31it/s, loss=0.0807, lr=0.0001] Steps: 97%|█████████▋| 676/700 [04:53<00:10, 2.31it/s, loss=0.089, lr=0.0001] Steps: 97%|█████████▋| 677/700 [04:53<00:09, 2.31it/s, loss=0.089, 
lr=0.0001] Steps: 97%|█████████▋| 677/700 [04:53<00:09, 2.31it/s, loss=0.0711, lr=0.0001] Steps: 97%|█████████▋| 678/700 [04:53<00:09, 2.31it/s, loss=0.0711, lr=0.0001] Steps: 97%|█████████▋| 678/700 [04:54<00:09, 2.31it/s, loss=0.183, lr=0.0001] Steps: 97%|█████████▋| 679/700 [04:54<00:09, 2.31it/s, loss=0.183, lr=0.0001] Steps: 97%|█████████▋| 679/700 [04:54<00:09, 2.31it/s, loss=0.118, lr=0.0001] Steps: 97%|█████████▋| 680/700 [04:54<00:08, 2.31it/s, loss=0.118, lr=0.0001] Steps: 97%|█████████▋| 680/700 [04:54<00:08, 2.31it/s, loss=0.0774, lr=0.0001] Steps: 97%|█████████▋| 681/700 [04:55<00:08, 2.31it/s, loss=0.0774, lr=0.0001] Steps: 97%|█████████▋| 681/700 [04:55<00:08, 2.31it/s, loss=0.109, lr=0.0001] Steps: 97%|█████████▋| 682/700 [04:55<00:07, 2.31it/s, loss=0.109, lr=0.0001] Steps: 97%|█████████▋| 682/700 [04:55<00:07, 2.31it/s, loss=0.0268, lr=0.0001] Steps: 98%|█████████▊| 683/700 [04:56<00:07, 2.31it/s, loss=0.0268, lr=0.0001] Steps: 98%|█████████▊| 683/700 [04:56<00:07, 2.31it/s, loss=0.0848, lr=0.0001] Steps: 98%|█████████▊| 684/700 [04:56<00:06, 2.31it/s, loss=0.0848, lr=0.0001] Steps: 98%|█████████▊| 684/700 [04:56<00:06, 2.31it/s, loss=0.158, lr=0.0001] Steps: 98%|█████████▊| 685/700 [04:57<00:06, 2.30it/s, loss=0.158, lr=0.0001] Steps: 98%|█████████▊| 685/700 [04:57<00:06, 2.30it/s, loss=0.0609, lr=0.0001] Steps: 98%|█████████▊| 686/700 [04:57<00:06, 2.30it/s, loss=0.0609, lr=0.0001] Steps: 98%|█████████▊| 686/700 [04:57<00:06, 2.30it/s, loss=0.137, lr=0.0001] Steps: 98%|█████████▊| 687/700 [04:57<00:05, 2.31it/s, loss=0.137, lr=0.0001] Steps: 98%|█████████▊| 687/700 [04:57<00:05, 2.31it/s, loss=0.078, lr=0.0001] Steps: 98%|█████████▊| 688/700 [04:58<00:05, 2.30it/s, loss=0.078, lr=0.0001] Steps: 98%|█████████▊| 688/700 [04:58<00:05, 2.30it/s, loss=0.0719, lr=0.0001] Steps: 98%|█████████▊| 689/700 [04:58<00:04, 2.31it/s, loss=0.0719, lr=0.0001] Steps: 98%|█████████▊| 689/700 [04:58<00:04, 2.31it/s, loss=0.06, lr=0.0001] Steps: 99%|█████████▊| 
690/700 [04:59<00:04, 2.31it/s, loss=0.06, lr=0.0001] Steps: 99%|█████████▊| 690/700 [04:59<00:04, 2.31it/s, loss=0.0883, lr=0.0001] Steps: 99%|█████████▊| 691/700 [04:59<00:03, 2.31it/s, loss=0.0883, lr=0.0001] Steps: 99%|█████████▊| 691/700 [04:59<00:03, 2.31it/s, loss=0.0885, lr=0.0001] Steps: 99%|█████████▉| 692/700 [05:00<00:03, 2.31it/s, loss=0.0885, lr=0.0001] Steps: 99%|█████████▉| 692/700 [05:00<00:03, 2.31it/s, loss=0.0699, lr=0.0001] Steps: 99%|█████████▉| 693/700 [05:00<00:03, 2.31it/s, loss=0.0699, lr=0.0001] Steps: 99%|█████████▉| 693/700 [05:00<00:03, 2.31it/s, loss=0.0816, lr=0.0001] Steps: 99%|█████████▉| 694/700 [05:00<00:02, 2.31it/s, loss=0.0816, lr=0.0001] Steps: 99%|█████████▉| 694/700 [05:00<00:02, 2.31it/s, loss=0.152, lr=0.0001] Steps: 99%|█████████▉| 695/700 [05:01<00:02, 2.31it/s, loss=0.152, lr=0.0001] Steps: 99%|█████████▉| 695/700 [05:01<00:02, 2.31it/s, loss=0.187, lr=0.0001] Steps: 99%|█████████▉| 696/700 [05:01<00:01, 2.31it/s, loss=0.187, lr=0.0001] Steps: 99%|█████████▉| 696/700 [05:01<00:01, 2.31it/s, loss=0.066, lr=0.0001] Steps: 100%|█████████▉| 697/700 [05:02<00:01, 2.30it/s, loss=0.066, lr=0.0001] Steps: 100%|█████████▉| 697/700 [05:02<00:01, 2.30it/s, loss=0.101, lr=0.0001] Steps: 100%|█████████▉| 698/700 [05:02<00:00, 2.30it/s, loss=0.101, lr=0.0001] Steps: 100%|█████████▉| 698/700 [05:02<00:00, 2.30it/s, loss=0.0533, lr=0.0001] Steps: 100%|█████████▉| 699/700 [05:03<00:00, 2.30it/s, loss=0.0533, lr=0.0001] Steps: 100%|█████████▉| 699/700 [05:03<00:00, 2.30it/s, loss=0.101, lr=0.0001] Steps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.101, lr=0.0001] Steps: 100%|██████████| 700/700 [05:03<00:00, 2.31it/s, loss=0.0975, lr=0.0001]Model weights saved in /tmp/train/output/sd35_large_train_replicate/pytorch_lora_weights.safetensors Loading pipeline components...: 0%| | 0/9 [00:00<?, ?it/s]Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stable-diffusion-3.5-large. 
{'use_dynamic_shifting', 'max_shift', 'base_shift', 'base_image_seq_len', 'max_image_seq_len'} was not found in config. Values will be initialized to default values. Loaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of stable-diffusion-3.5-large. Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stable-diffusion-3.5-large. Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s] Loading checkpoint shards: 50%|█████ | 1/2 [00:05<00:05, 5.05s/it] Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.79s/it] Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.83s/it] Loaded text_encoder_3 as T5EncoderModel from `text_encoder_3` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 44%|████▍ | 4/9 [00:09<00:12, 2.45s/it]Loaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 56%|█████▌ | 5/9 [00:11<00:08, 2.20s/it]Loaded text_encoder as CLIPTextModelWithProjection from `text_encoder` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 67%|██████▋ | 6/9 [00:11<00:05, 1.70s/it]Loaded tokenizer_3 as T5TokenizerFast from `tokenizer_3` subfolder of stable-diffusion-3.5-large. Loading pipeline components...: 78%|███████▊ | 7/9 [00:11<00:02, 1.28s/it]Loaded vae as AutoencoderKL from `vae` subfolder of stable-diffusion-3.5-large. {'dual_attention_layers'} was not found in config. Values will be initialized to default values. Loaded transformer as SD3Transformer2DModel from `transformer` subfolder of stable-diffusion-3.5-large. 
Loading pipeline components...: 100%|██████████| 9/9 [00:13<00:00, 1.14s/it] Loading pipeline components...: 100%|██████████| 9/9 [00:13<00:00, 1.53s/it] 0%| | 0/1 [00:00<?, ?it/s] 100%|██████████| 1/1 [00:00<00:00, 1.41it/s] 100%|██████████| 1/1 [00:00<00:00, 1.41it/s] Steps: 100%|██████████| 700/700 [05:20<00:00, 2.19it/s, loss=0.0975, lr=0.0001] ./ ./output/ ./output/sd35_large_train_replicate/ ./output/sd35_large_train_replicate/README.md ./output/sd35_large_train_replicate/lora.safetensors
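The tqdm progress lines in the log above carry one loss value per optimization step, and each step is printed twice: once with the previous step's loss and once with its own. As a rough sketch for pulling those values out of a saved copy of the log (the regular expression and the `parse_losses` helper are illustrative assumptions, not part of the trainer or of Replicate's client), something like this could work:

```python
import re

# Matches the step counter and postfix tqdm prints, for example
# "614/700 [04:26<00:37, 2.30it/s, loss=0.0502, lr=0.0001]".
# Lines without a loss postfix (such as the very first step) do not match.
STEP_RE = re.compile(r"(\d+)/(\d+) \[[^\]]*?loss=([0-9.]+), lr=([0-9.eE+-]+)\]")


def parse_losses(log_text):
    """Return {step: loss}, keeping the last loss printed for each step."""
    losses = {}
    for step, _total, loss, _lr in STEP_RE.findall(log_text):
        # Later matches overwrite earlier ones, so the stale value printed
        # right after the step counter advances is replaced by the fresh one.
        losses[int(step)] = float(loss)
    return losses


SAMPLE = (
    "Steps: 88%|████████▊ | 614/700 [04:26<00:37, 2.30it/s, loss=0.0591, lr=0.0001] "
    "Steps: 88%|████████▊ | 614/700 [04:26<00:37, 2.30it/s, loss=0.0502, lr=0.0001] "
    "Steps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.0502, lr=0.0001] "
    "Steps: 88%|████████▊ | 615/700 [04:26<00:36, 2.31it/s, loss=0.0879, lr=0.0001]"
)

print(parse_losses(SAMPLE))  # {614: 0.0502, 615: 0.0879}
```

Per-step losses in this run swing between roughly 0.03 and 0.2, so a moving average over the parsed values is likely a better view of the trend than any single step.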
Prediction
lucataco/stable-diffusion-3.5-large-lora-trainer:6ebda45af5b9c30edee3149cc1624b7f7cae8fab7c692e2c51d82f5fed3198ee
ID: 6ty97311w9rj00cjx4pbwngcp0
Status: Succeeded
Source: API
Hardware: A100 (80GB)
Total duration
Created by @lucataco
Input
- rank
- 16
- backend
- no
- hf_token
- ████████████████████
This value was redacted after being sent to the model.
- optimizer
- prodigy
- resolution
- 768
- hub_model_id
- lucataco/SD3.5-Large-yarn-2
- input_images
- yarn.zip
- lr_scheduler
- constant
- learning_rate
- 1
- instance_prompt
- Frog, yarn art style
- max_train_steps
- 1500
- text_encoder_lr
- 1
- train_batch_size
- 1
- train_text_encoder
- true
- gradient_accumulation_steps
- 1
{ "rank": 16, "backend": "no", "hf_token": "[REDACTED]", "optimizer": "prodigy", "resolution": 768, "hub_model_id": "lucataco/SD3.5-Large-yarn-2", "input_images": "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip", "lr_scheduler": "constant", "learning_rate": 1, "instance_prompt": "Frog, yarn art style", "max_train_steps": 1500, "text_encoder_lr": 1, "train_batch_size": 1, "train_text_encoder": true, "gradient_accumulation_steps": 1 }
Install Replicate’s Node.js client library:
npm install replicate
Import and set up the client:
import Replicate from "replicate";
const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
import { writeFile } from "node:fs/promises";

const output = await replicate.run(
  "lucataco/stable-diffusion-3.5-large-lora-trainer:6ebda45af5b9c30edee3149cc1624b7f7cae8fab7c692e2c51d82f5fed3198ee",
  {
    input: {
      rank: 16,
      backend: "no",
      hf_token: "[REDACTED]",
      optimizer: "prodigy",
      resolution: 768,
      hub_model_id: "lucataco/SD3.5-Large-yarn-2",
      input_images: "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip",
      lr_scheduler: "constant",
      learning_rate: 1,
      instance_prompt: "Frog, yarn art style",
      max_train_steps: 1500,
      text_encoder_lr: 1,
      train_batch_size: 1,
      train_text_encoder: true,
      gradient_accumulation_steps: 1
    }
  }
);

// To access the file URL:
console.log(output.url()); //=> "http://example.com"

// To write the file to disk:
await writeFile("my-image.png", output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:
pip install replicate
Import the client:
import replicate
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "lucataco/stable-diffusion-3.5-large-lora-trainer:6ebda45af5b9c30edee3149cc1624b7f7cae8fab7c692e2c51d82f5fed3198ee",
    input={
        "rank": 16,
        "backend": "no",
        "hf_token": "[REDACTED]",
        "optimizer": "prodigy",
        "resolution": 768,
        "hub_model_id": "lucataco/SD3.5-Large-yarn-2",
        "input_images": "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip",
        "lr_scheduler": "constant",
        "learning_rate": 1,
        "instance_prompt": "Frog, yarn art style",
        "max_train_steps": 1500,
        "text_encoder_lr": 1,
        "train_batch_size": 1,
        "train_text_encoder": True,
        "gradient_accumulation_steps": 1
    }
)
print(output)
To learn more, take a look at the guide on getting started with Python.
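The Python call above passes `hf_token` inline. One way to keep the token out of source files is to read it from the environment when assembling the input. The sketch below assumes the token lives in an environment variable named `HF_TOKEN`; that variable name and the `build_training_input` helper are hypothetical and not part of the model's schema or Replicate's client.

```python
import os


def build_training_input(images_url, hub_model_id=None, env=os.environ):
    """Assemble the trainer input used in this run.

    The Hugging Face token is only added when a hub_model_id is given,
    and it is read from the HF_TOKEN variable (an assumed name).
    """
    training_input = {
        "rank": 16,
        "backend": "no",
        "optimizer": "prodigy",
        "resolution": 768,
        "input_images": images_url,
        "lr_scheduler": "constant",
        "learning_rate": 1,
        "instance_prompt": "Frog, yarn art style",
        "max_train_steps": 1500,
        "text_encoder_lr": 1,
        "train_batch_size": 1,
        "train_text_encoder": True,
        "gradient_accumulation_steps": 1,
    }
    if hub_model_id is not None:
        token = env.get("HF_TOKEN")
        if not token:
            raise RuntimeError("HF_TOKEN must be set to push to the Hub")
        training_input["hf_token"] = token
        training_input["hub_model_id"] = hub_model_id
    return training_input


# Usage (not executed here, requires network access and a Replicate token):
# import replicate
# output = replicate.run(
#     "lucataco/stable-diffusion-3.5-large-lora-trainer:6ebda45af5b9c30edee3149cc1624b7f7cae8fab7c692e2c51d82f5fed3198ee",
#     input=build_training_input(
#         "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip",
#         hub_model_id="lucataco/SD3.5-Large-yarn-2",
#     ),
# )
```

Passing `env` as a parameter keeps the helper easy to exercise with a plain dictionary instead of the real process environment.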
Run lucataco/stable-diffusion-3.5-large-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "6ebda45af5b9c30edee3149cc1624b7f7cae8fab7c692e2c51d82f5fed3198ee",
    "input": {
      "rank": 16,
      "backend": "no",
      "hf_token": "[REDACTED]",
      "optimizer": "prodigy",
      "resolution": 768,
      "hub_model_id": "lucataco/SD3.5-Large-yarn-2",
      "input_images": "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip",
      "lr_scheduler": "constant",
      "learning_rate": 1,
      "instance_prompt": "Frog, yarn art style",
      "max_train_steps": 1500,
      "text_encoder_lr": 1,
      "train_batch_size": 1,
      "train_text_encoder": true,
      "gradient_accumulation_steps": 1
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
Output
{ "completed_at": "2024-11-01T19:38:58.822748Z", "created_at": "2024-11-01T19:20:36.578000Z", "data_removed": false, "error": null, "id": "6ty97311w9rj00cjx4pbwngcp0", "input": { "rank": 16, "backend": "no", "hf_token": "[REDACTED]", "optimizer": "prodigy", "resolution": 768, "hub_model_id": "lucataco/SD3.5-Large-yarn-2", "input_images": "https://replicate.delivery/pbxt/LrJveDd3TVKraYSxEWkMl0txKP39KdIBof5EO2IAsuTNIrFU/yarn.zip", "lr_scheduler": "constant", "learning_rate": 1, "instance_prompt": "Frog, yarn art style", "max_train_steps": 1500, "text_encoder_lr": 1, "train_batch_size": 1, "train_text_encoder": true, "gradient_accumulation_steps": 1 }, "logs": "Using seed: 2592116838\nExtracted 16 files from zip to input_images\nThe token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.\nToken is valid (permission: write).\nYour token has been saved to /root/.cache/huggingface/token\nLogin successful\nUsing params: ['accelerate', 'launch', '--dynamo_backend', 'no', 'train_dreambooth_lora_sd3.py', '--pretrained_model_name_or_path', 'stable-diffusion-3.5-large', '--instance_data_dir', 'input_images', '--rank', '16', '--output_dir', '/tmp/train/output/sd35_large_train_replicate', '--mixed_precision', 'bf16', '--instance_prompt', 'Frog, yarn art style', '--resolution', '768', '--train_batch_size', '1', '--gradient_accumulation_steps', '1', '--optimizer', 'prodigy', '--learning_rate', '1.0', '--text_encoder_lr', '1.0', '--lr_scheduler', 'constant', '--lr_warmup_steps', '0', '--max_train_steps', '1500', '--checkpointing_steps', '1501', '--seed', '2592116838', '--logging_dir', '/tmp/logs', '--push_to_hub', '--hub_token', '[REDACTED]', '--hub_model_id', 'lucataco/SD3.5-Large-yarn-2', '--train_text_encoder']\n11/01/2024 19:21:43 - INFO - __main__ - Distributed environment: 
DistributedType.NO\nNum processes: 1\nProcess index: 0\nLocal process index: 0\nDevice: cuda\nMixed precision type: bf16\nYou set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers\nYou are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.\nYou are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.\nYou are using a model of type t5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.\n{'max_image_seq_len', 'base_image_seq_len', 'max_shift', 'base_shift', 'use_dynamic_shifting'} was not found in config. Values will be initialized to default values.\nLoading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]\nLoading checkpoint shards: 50%|█████ | 1/2 [00:03<00:03, 3.58s/it]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:06<00:00, 3.44s/it]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:06<00:00, 3.46s/it]\n{'dual_attention_layers'} was not found in config. Values will be initialized to default values.\n11/01/2024 19:22:26 - WARNING - __main__ - Learning rates were provided both for the transformer and the text encoder- e.g. text_encoder_lr: 1.0 and learning_rate: 1.0. When using prodigy only learning_rate is used as the initial learning rate.\nUsing decoupled weight decay\n11/01/2024 19:22:27 - INFO - __main__ - ***** Running training *****\n11/01/2024 19:22:27 - INFO - __main__ - Num examples = 16\n11/01/2024 19:22:27 - INFO - __main__ - Num batches each epoch = 16\n11/01/2024 19:22:27 - INFO - __main__ - Num Epochs = 94\n11/01/2024 19:22:27 - INFO - __main__ - Instantaneous batch size per device = 1\n11/01/2024 19:22:27 - INFO - __main__ - Total train batch size (w. 
parallel, distributed & accumulation) = 1\n11/01/2024 19:22:27 - INFO - __main__ - Gradient Accumulation steps = 1\n11/01/2024 19:22:27 - INFO - __main__ - Total optimization steps = 1500\nSteps: 0%| | 0/1500 [00:00<?, ?it/s]\nSteps: 0%| | 1/1500 [00:01<42:23, 1.70s/it]\nSteps: 0%| | 1/1500 [00:01<42:23, 1.70s/it, loss=0.0865, lr=1]\nSteps: 0%| | 2/1500 [00:02<26:52, 1.08s/it, loss=0.0865, lr=1]\nSteps: 0%| | 2/1500 [00:02<26:52, 1.08s/it, loss=0.192, lr=1] \nSteps: 0%| | 3/1500 [00:02<21:54, 1.14it/s, loss=0.192, lr=1]\nSteps: 0%| | 3/1500 [00:02<21:54, 1.14it/s, loss=0.0926, lr=1]\nSteps: 0%| | 4/1500 [00:03<19:34, 1.27it/s, loss=0.0926, lr=1]\nSteps: 0%| | 4/1500 [00:03<19:34, 1.27it/s, loss=0.113, lr=1] \nSteps: 0%| | 5/\n[...] log volume exceeds 256KiB size limit: truncating logs [...]\n0 [00:30<15:39, 1.55it/s, loss=0.205, lr=1]\nSteps: 3%|▎ | 46/1500 [00:30<15:39, 1.55it/s, loss=0.26, lr=1] \nSteps: 3%|▎ | 47/1500 [00:31<15:39, 1.55it/s, loss=0.26, lr=1]\nSteps: 3%|▎ | 47/1500 [00:31<15:39, 1.55it/s, loss=0.148, lr=1]\nSteps: 3%|▎ | 48/1500 [00:32<15:38, 1.55it/s, loss=0.148, lr=1]\nSteps: 3%|▎ | 48/1500 [00:32<15:38, 1.55it/s, loss=0.264, lr=1]\nSteps: 3%|▎ | 49/1500 [00:32<15:44, 1.54it/s, loss=0.264, lr=1]\nSteps: 3%|▎ | 49/1500 [00:32<15:44, 1.54it/s, loss=0.0476, lr=1]\nSteps: 3%|▎ | 50/1500 [00:33<15:41, 1.54it/s, loss=0.0476, lr=1]\nSteps: 3%|▎ | 50/1500 [00:33<15:41, 1.54it/s, loss=0.0878, lr=1]\nSteps: 3%|▎ | 51/1500 [00:34<15:41, 1.54it/s, loss=0.0878, lr=1]\nSteps: 3%|▎ | 51/1500 [00:34<15:41, 1.54it/s, loss=0.201, lr=1] \nSteps: 3%|▎ | 52/1500 [00:34<15:41, 1.54it/s, loss=0.201, lr=1]\nSteps: 3%|▎ | 52/1500 [00:34<15:41, 1.54it/s, loss=0.114, lr=1]\nSteps: 4%|▎ | 53/1500 [00:35<15:38, 1.54it/s, loss=0.114, lr=1]\nSteps: 4%|▎ | 53/1500 [00:35<15:38, 1.54it/s, loss=0.165, lr=1]\nSteps: 4%|▎ | 54/1500 [00:36<15:36, 1.54it/s, loss=0.165, lr=1]\nSteps: 4%|▎ | 54/1500 [00:36<15:36, 1.54it/s, loss=0.148, lr=1]\nSteps: 4%|▎ | 55/1500 [00:36<15:35, 
1.55it/s, loss=0.148, lr=1]\nSteps: 4%|▎ | 55/1500 [00:36<15:35, 1.55it/s, loss=0.225, lr=1]\nSteps: 4%|▎ | 56/1500 [00:37<15:33, 1.55it/s, loss=0.225, lr=1]\nSteps: 4%|▎ | 56/1500 [00:37<15:33, 1.55it/s, loss=0.242, lr=1]\nSteps: 4%|▍ | 57/1500 [00:38<15:32, 1.55it/s, loss=0.242, lr=1]\nSteps: 4%|▍ | 57/1500 [00:38<15:32, 1.55it/s, loss=0.0901, lr=1]\nSteps: 4%|▍ | 58/1500 [00:38<15:30, 1.55it/s, loss=0.0901, lr=1]\nSteps: 4%|▍ | 58/1500 [00:38<15:30, 1.55it/s, loss=0.321, lr=1] \nSteps: 4%|▍ | 59/1500 [00:39<15:30, 1.55it/s, loss=0.321, lr=1]\nSteps: 4%|▍ | 59/1500 [00:39<15:30, 1.55it/s, loss=0.105, lr=1]\nSteps: 4%|▍ | 60/1500 [00:39<15:29, 1.55it/s, loss=0.105, lr=1]\nSteps: 4%|▍ | 60/1500 [00:39<15:29, 1.55it/s, loss=0.145, lr=1]\nSteps: 4%|▍ | 61/1500 [00:40<15:29, 1.55it/s, loss=0.145, lr=1]\nSteps: 4%|▍ | 61/1500 [00:40<15:29, 1.55it/s, loss=0.108, lr=1]\nSteps: 4%|▍ | 62/1500 [00:41<15:29, 1.55it/s, loss=0.108, lr=1]\nSteps: 4%|▍ | 62/1500 [00:41<15:29, 1.55it/s, loss=0.142, lr=1]\nSteps: 4%|▍ | 63/1500 [00:41<15:28, 1.55it/s, loss=0.142, lr=1]\nSteps: 4%|▍ | 63/1500 [00:41<15:28, 1.55it/s, loss=0.311, lr=1]\nSteps: 4%|▍ | 64/1500 [00:42<15:27, 1.55it/s, loss=0.311, lr=1]\nSteps: 4%|▍ | 64/1500 [00:42<15:27, 1.55it/s, loss=0.178, lr=1]\nSteps: 4%|▍ | 65/1500 [00:43<15:32, 1.54it/s, loss=0.178, lr=1]\nSteps: 4%|▍ | 65/1500 [00:43<15:32, 1.54it/s, loss=0.221, lr=1]\nSteps: 4%|▍ | 66/1500 [00:43<15:29, 1.54it/s, loss=0.221, lr=1]\nSteps: 4%|▍ | 66/1500 [00:43<15:29, 1.54it/s, loss=0.121, lr=1]\nSteps: 4%|▍ | 67/1500 [00:44<15:28, 1.54it/s, loss=0.121, lr=1]\nSteps: 4%|▍ | 67/1500 [00:44<15:28, 1.54it/s, loss=0.151, lr=1]\nSteps: 5%|▍ | 68/1500 [00:45<15:26, 1.55it/s, loss=0.151, lr=1]\nSteps: 5%|▍ | 68/1500 [00:45<15:26, 1.55it/s, loss=0.123, lr=1]\nSteps: 5%|▍ | 69/1500 [00:45<15:25, 1.55it/s, loss=0.123, lr=1]\nSteps: 5%|▍ | 69/1500 [00:45<15:25, 1.55it/s, loss=0.164, lr=1]\nSteps: 5%|▍ | 70/1500 [00:46<15:23, 1.55it/s, loss=0.164, lr=1]\nSteps: 5%|▍ | 
70/1500 [00:46<15:23, 1.55it/s, loss=0.223, lr=1]\nSteps: 5%|▍ | 71/1500 [00:47<15:22, 1.55it/s, loss=0.223, lr=1]\nSteps: 5%|▍ | 71/1500 [00:47<15:22, 1.55it/s, loss=0.121, lr=1]\nSteps: 5%|▍ | 72/1500 [00:47<15:20, 1.55it/s, loss=0.121, lr=1]\nSteps: 5%|▍ | 72/1500 [00:47<15:20, 1.55it/s, loss=0.172, lr=1]\nSteps: 5%|▍ | 73/1500 [00:48<15:19, 1.55it/s, loss=0.172, lr=1]\nSteps: 5%|▍ | 73/1500 [00:48<15:19, 1.55it/s, loss=0.185, lr=1]\nSteps: 5%|▍ | 74/1500 [00:49<15:17, 1.55it/s, loss=0.185, lr=1]\nSteps: 5%|▍ | 74/1500 [00:49<15:17, 1.55it/s, loss=0.116, lr=1]\nSteps: 5%|▌ | 75/1500 [00:49<15:18, 1.55it/s, loss=0.116, lr=1]\nSteps: 5%|▌ | 75/1500 [00:49<15:18, 1.55it/s, loss=0.0638, lr=1]\nSteps: 5%|▌ | 76/1500 [00:50<15:18, 1.55it/s, loss=0.0638, lr=1]\nSteps: 5%|▌ | 76/1500 [00:50<15:18, 1.55it/s, loss=0.192, lr=1] \nSteps: 5%|▌ | 77/1500 [00:50<15:16, 1.55it/s, loss=0.192, lr=1]\nSteps: 5%|▌ | 77/1500 [00:50<15:16, 1.55it/s, loss=0.0889, lr=1]\nSteps: 5%|▌ | 78/1500 [00:51<15:16, 1.55it/s, loss=0.0889, lr=1]\nSteps: 5%|▌ | 78/1500 [00:51<15:16, 1.55it/s, loss=0.156, lr=1] \nSteps: 5%|▌ | 79/1500 [00:52<15:15, 1.55it/s, loss=0.156, lr=1]\nSteps: 5%|▌ | 79/1500 [00:52<15:15, 1.55it/s, loss=0.0374, lr=1]\nSteps: 5%|▌ | 80/1500 [00:52<15:13, 1.55it/s, loss=0.0374, lr=1]\nSteps: 5%|▌ | 80/1500 [00:52<15:13, 1.55it/s, loss=0.0846, lr=1]\nSteps: 5%|▌ | 81/1500 [00:53<15:18, 1.54it/s, loss=0.0846, lr=1]\nSteps: 5%|▌ | 81/1500 [00:53<15:18, 1.54it/s, loss=0.143, lr=1] \nSteps: 5%|▌ | 82/1500 [00:54<15:19, 1.54it/s, loss=0.143, lr=1]\nSteps: 5%|▌ | 82/1500 [00:54<15:19, 1.54it/s, loss=0.0692, lr=1]\nSteps: 6%|▌ | 83/1500 [00:54<15:18, 1.54it/s, loss=0.0692, lr=1]\nSteps: 6%|▌ | 83/1500 [00:54<15:18, 1.54it/s, loss=0.214, lr=1] \nSteps: 6%|▌ | 84/1500 [00:55<15:16, 1.54it/s, loss=0.214, lr=1]\nSteps: 6%|▌ | 84/1500 [00:55<15:16, 1.54it/s, loss=0.0942, lr=1]\nSteps: 6%|▌ | 85/1500 [00:56<15:14, 1.55it/s, loss=0.0942, lr=1]\nSteps: 6%|▌ | 85/1500 [00:56<15:14, 1.55it/s, 
loss=0.207, lr=1] \nSteps: 6%|▌ | 86/1500 [00:56<15:13, 1.55it/s, loss=0.207, lr=1]\nSteps: 6%|▌ | 86/1500 [00:56<15:13, 1.55it/s, loss=0.255, lr=1]\nSteps: 6%|▌ | 87/1500 [00:57<15:12, 1.55it/s, loss=0.255, lr=1]\nSteps: 6%|▌ | 87/1500 [00:57<15:12, 1.55it/s, loss=0.111, lr=1]\nSteps: 6%|▌ | 88/1500 [00:58<15:12, 1.55it/s, loss=0.111, lr=1]\nSteps: 6%|▌ | 88/1500 [00:58<15:12, 1.55it/s, loss=0.0414, lr=1]\nSteps: 6%|▌ | 89/1500 [00:58<15:10, 1.55it/s, loss=0.0414, lr=1]\nSteps: 6%|▌ | 89/1500 [00:58<15:10, 1.55it/s, loss=0.192, lr=1] \nSteps: 6%|▌ | 90/1500 [00:59<15:09, 1.55it/s, loss=0.192, lr=1]\nSteps: 6%|▌ | 90/1500 [00:59<15:09, 1.55it/s, loss=0.105, lr=1]\nSteps: 6%|▌ | 91/1500 [00:59<15:09, 1.55it/s, loss=0.105, lr=1]\nSteps: 6%|▌ | 91/1500 [00:59<15:09, 1.55it/s, loss=0.131, lr=1]\nSteps: 6%|▌ | 92/1500 [01:00<15:08, 1.55it/s, loss=0.131, lr=1]\nSteps: 6%|▌ | 92/1500 [01:00<15:08, 1.55it/s, loss=0.0768, lr=1]\nSteps: 6%|▌ | 93/1500 [01:01<15:11, 1.54it/s, loss=0.0768, lr=1]\nSteps: 6%|▌ | 93/1500 [01:01<15:11, 1.54it/s, loss=0.0205, lr=1]\nSteps: 6%|▋ | 94/1500 [01:01<15:08, 1.55it/s, loss=0.0205, lr=1]\nSteps: 6%|▋ | 94/1500 [01:01<15:08, 1.55it/s, loss=0.137, lr=1] \nSteps: 6%|▋ | 95/1500 [01:02<15:08, 1.55it/s, loss=0.137, lr=1]\nSteps: 6%|▋ | 95/1500 [01:02<15:08, 1.55it/s, loss=0.0869, lr=1]\nSteps: 6%|▋ | 96/1500 [01:03<15:05, 1.55it/s, loss=0.0869, lr=1]\nSteps: 6%|▋ | 96/1500 [01:03<15:05, 1.55it/s, loss=0.125, lr=1] \nSteps: 6%|▋ | 97/1500 [01:03<15:11, 1.54it/s, loss=0.125, lr=1]\nSteps: 6%|▋ | 97/1500 [01:03<15:11, 1.54it/s, loss=0.169, lr=1]\nSteps: 7%|▋ | 98/1500 [01:04<15:10, 1.54it/s, loss=0.169, lr=1]\nSteps: 7%|▋ | 98/1500 [01:04<15:10, 1.54it/s, loss=0.148, lr=1]\nSteps: 7%|▋ | 99/1500 [01:05<15:07, 1.54it/s, loss=0.148, lr=1]\nSteps: 7%|▋ | 99/1500 [01:05<15:07, 1.54it/s, loss=0.311, lr=1]\nSteps: 7%|▋ | 100/1500 [01:05<15:10, 1.54it/s, loss=0.311, lr=1]\nSteps: 7%|▋ | 100/1500 [01:05<15:10, 1.54it/s, loss=0.107, lr=1]\nSteps: 7%|▋ | 
101/1500 [01:06<15:08, 1.54it/s, loss=0.107, lr=1]\nSteps: 7%|▋ | 101/1500 [01:06<15:08, 1.54it/s, loss=0.17, lr=1] \nSteps: 7%|▋ | 102/1500 [01:07<15:09, 1.54it/s, loss=0.17, lr=1]\nSteps: 7%|▋ | 102/1500 [01:07<15:09, 1.54it/s, loss=0.0375, lr=1]\nSteps: 7%|▋ | 103/1500 [01:07<15:06, 1.54it/s, loss=0.0375, lr=1]\nSteps: 7%|▋ | 103/1500 [01:07<15:06, 1.54it/s, loss=0.168, lr=1] \nSteps: 7%|▋ | 104/1500 [01:08<15:04, 1.54it/s, loss=0.168, lr=1]\nSteps: 7%|▋ | 104/1500 [01:08<15:04, 1.54it/s, loss=0.111, lr=1]\nSteps: 7%|▋ | 105/1500 [01:09<15:01, 1.55it/s, loss=0.111, lr=1]\nSteps: 7%|▋ | 105/1500 [01:09<15:01, 1.55it/s, loss=0.0957, lr=1]\nSteps: 7%|▋ | 106/1500 [01:09<15:00, 1.55it/s, loss=0.0957, lr=1]\nSteps: 7%|▋ | 106/1500 [01:09<15:00, 1.55it/s, loss=0.392, lr=1] \nSteps: 7%|▋ | 107/1500 [01:10<14:59, 1.55it/s, loss=0.392, lr=1]\nSteps: 7%|▋ | 107/1500 [01:10<14:59, 1.55it/s, loss=0.279, lr=1]\nSteps: 7%|▋ | 108/1500 [01:11<15:00, 1.55it/s, loss=0.279, lr=1]\nSteps: 7%|▋ | 108/1500 [01:11<15:00, 1.55it/s, loss=0.331, lr=1]\nSteps: 7%|▋ | 109/1500 [01:11<14:59, 1.55it/s, loss=0.331, lr=1]\nSteps: 7%|▋ | 109/1500 [01:11<14:59, 1.55it/s, loss=0.236, lr=1]\nSteps: 7%|▋ | 110/1500 [01:12<14:59, 1.55it/s, loss=0.236, lr=1]\nSteps: 7%|▋ | 110/1500 [01:12<14:59, 1.55it/s, loss=0.226, lr=1]\nSteps: 7%|▋ | 111/1500 [01:12<14:58, 1.55it/s, loss=0.226, lr=1]\nSteps: 7%|▋ | 111/1500 [01:12<14:58, 1.55it/s, loss=0.0534, lr=1]\nSteps: 7%|▋ | 112/1500 [01:13<14:58, 1.54it/s, loss=0.0534, lr=1]\nSteps: 7%|▋ | 112/1500 [01:13<14:58, 1.54it/s, loss=0.264, lr=1] \nSteps: 8%|▊ | 113/1500 [01:14<15:03, 1.53it/s, loss=0.264, lr=1]\nSteps: 8%|▊ | 113/1500 [01:14<15:03, 1.53it/s, loss=0.284, lr=1]\nSteps: 8%|▊ | 114/1500 [01:14<15:01, 1.54it/s, loss=0.284, lr=1]\nSteps: 8%|▊ | 114/1500 [01:14<15:01, 1.54it/s, loss=0.138, lr=1]\nSteps: 8%|▊ | 115/1500 [01:15<14:58, 1.54it/s, loss=0.138, lr=1]\nSteps: 8%|▊ | 115/1500 [01:15<14:58, 1.54it/s, loss=0.0913, lr=1]\nSteps: 8%|▊ | 116/1500 
[01:16<14:57, 1.54it/s, loss=0.0913, lr=1]\nSteps: 8%|▊ | 116/1500 [01:16<14:57, 1.54it/s, loss=0.0745, lr=1]\nSteps: 8%|▊ | 117/1500 [01:16<14:56, 1.54it/s, loss=0.0745, lr=1]\nSteps: 8%|▊ | 117/1500 [01:16<14:56, 1.54it/s, loss=0.124, lr=1] \nSteps: 8%|▊ | 118/1500 [01:17<14:56, 1.54it/s, loss=0.124, lr=1]\nSteps: 8%|▊ | 118/1500 [01:17<14:56, 1.54it/s, loss=0.22, lr=1] \nSteps: 8%|▊ | 119/1500 [01:18<14:55, 1.54it/s, loss=0.22, lr=1]\nSteps: 8%|▊ | 119/1500 [01:18<14:55, 1.54it/s, loss=0.115, lr=1]\nSteps: 8%|▊ | 120/1500 [01:18<14:53, 1.55it/s, loss=0.115, lr=1]\nSteps: 8%|▊ | 120/1500 [01:18<14:53, 1.55it/s, loss=0.137, lr=1]\nSteps: 8%|▊ | 121/1500 [01:19<14:51, 1.55it/s, loss=0.137, lr=1]\nSteps: 8%|▊ | 121/1500 [01:19<14:51, 1.55it/s, loss=0.223, lr=1]\nSteps: 8%|▊ | 122/1500 [01:20<14:49, 1.55it/s, loss=0.223, lr=1]\nSteps: 8%|▊ | 122/1500 [01:20<14:49, 1.55it/s, loss=0.121, lr=1]\nSteps: 8%|▊ | 123/1500 [01:20<14:48, 1.55it/s, loss=0.121, lr=1]\nSteps: 8%|▊ | 123/1500 [01:20<14:48, 1.55it/s, loss=0.332, lr=1]\nSteps: 8%|▊ | 124/1500 [01:21<14:47, 1.55it/s, loss=0.332, lr=1]\nSteps: 8%|▊ | 124/1500 [01:21<14:47, 1.55it/s, loss=0.054, lr=1]\nSteps: 8%|▊ | 125/1500 [01:22<14:46, 1.55it/s, loss=0.054, lr=1]\nSteps: 8%|▊ | 125/1500 [01:22<14:46, 1.55it/s, loss=0.109, lr=1]\nSteps: 8%|▊ | 126/1500 [01:22<14:44, 1.55it/s, loss=0.109, lr=1]\nSteps: 8%|▊ | 126/1500 [01:22<14:44, 1.55it/s, loss=0.302, lr=1]\nSteps: 8%|▊ | 127/1500 [01:23<14:45, 1.55it/s, loss=0.302, lr=1]\nSteps: 8%|▊ | 127/1500 [01:23<14:45, 1.55it/s, loss=0.127, lr=1]\nSteps: 9%|▊ | 128/1500 [01:23<14:45, 1.55it/s, loss=0.127, lr=1]\nSteps: 9%|▊ | 128/1500 [01:23<14:45, 1.55it/s, loss=0.064, lr=1]\nSteps: 9%|▊ | 129/1500 [01:24<14:49, 1.54it/s, loss=0.064, lr=1]\nSteps: 9%|▊ | 129/1500 [01:24<14:49, 1.54it/s, loss=0.158, lr=1]\nSteps: 9%|▊ | 130/1500 [01:25<14:47, 1.54it/s, loss=0.158, lr=1]\nSteps: 9%|▊ | 130/1500 [01:25<14:47, 1.54it/s, loss=0.11, lr=1] \nSteps: 9%|▊ | 131/1500 [01:25<14:46, 
1.54it/s, loss=0.11, lr=1]\nSteps: 9%|▊ | 131/1500 [01:25<14:46, 1.54it/s, loss=0.0911, lr=1]\nSteps: 9%|▉ | 132/1500 [01:26<14:44, 1.55it/s, loss=0.0911, lr=1]\nSteps: 9%|▉ | 132/1500 [01:26<14:44, 1.55it/s, loss=0.202, lr=1] \nSteps: 9%|▉ | 133/1500 [01:27<14:42, 1.55it/s, loss=0.202, lr=1]\nSteps: 9%|▉ | 133/1500 [01:27<14:42, 1.55it/s, loss=0.136, lr=1]\nSteps: 9%|▉ | 134/1500 [01:27<14:41, 1.55it/s, loss=0.136, lr=1]\nSteps: 9%|▉ | 134/1500 [01:27<14:41, 1.55it/s, loss=0.159, lr=1]\nSteps: 9%|▉ | 135/1500 [01:28<14:41, 1.55it/s, loss=0.159, lr=1]\nSteps: 9%|▉ | 135/1500 [01:28<14:41, 1.55it/s, loss=0.128, lr=1]\nSteps: 9%|▉ | 136/1500 [01:29<14:39, 1.55it/s, loss=0.128, lr=1]\nSteps: 9%|▉ | 136/1500 [01:29<14:39, 1.55it/s, loss=0.256, lr=1]\nSteps: 9%|▉ | 137/1500 [01:29<14:38, 1.55it/s, loss=0.256, lr=1]\nSteps: 9%|▉ | 137/1500 [01:29<14:38, 1.55it/s, loss=0.0531, lr=1]\nSteps: 9%|▉ | 138/1500 [01:30<14:38, 1.55it/s, loss=0.0531, lr=1]\nSteps: 9%|▉ | 138/1500 [01:30<14:38, 1.55it/s, loss=0.126, lr=1] \nSteps: 9%|▉ | 139/1500 [01:31<14:38, 1.55it/s, loss=0.126, lr=1]\nSteps: 9%|▉ | 139/1500 [01:31<14:38, 1.55it/s, loss=0.13, lr=1] \nSteps: 9%|▉ | 140/1500 [01:31<14:36, 1.55it/s, loss=0.13, lr=1]\nSteps: 9%|▉ | 140/1500 [01:31<14:36, 1.55it/s, loss=0.154, lr=1]\nSteps: 9%|▉ | 141/1500 [01:32<14:36, 1.55it/s, loss=0.154, lr=1]\nSteps: 9%|▉ | 141/1500 [01:32<14:36, 1.55it/s, loss=0.242, lr=1]\nSteps: 9%|▉ | 142/1500 [01:32<14:36, 1.55it/s, loss=0.242, lr=1]\nSteps: 9%|▉ | 142/1500 [01:32<14:36, 1.55it/s, loss=0.194, lr=1]\nSteps: 10%|▉ | 143/1500 [01:33<14:35, 1.55it/s, loss=0.194, lr=1]\nSteps: 10%|▉ | 143/1500 [01:33<14:35, 1.55it/s, loss=0.254, lr=1]\nSteps: 10%|▉ | 144/1500 [01:34<14:35, 1.55it/s, loss=0.254, lr=1]\nSteps: 10%|▉ | 144/1500 [01:34<14:35, 1.55it/s, loss=0.185, lr=1]\nSteps: 10%|▉ | 145/1500 [01:34<14:41, 1.54it/s, loss=0.185, lr=1]\nSteps: 10%|▉ | 145/1500 [01:34<14:41, 1.54it/s, loss=0.163, lr=1]\nSteps: 10%|▉ | 146/1500 [01:35<14:41, 
1.54it/s, loss=0.163, lr=1]\nSteps: 10%|▉ | 146/1500 [01:35<14:41, 1.54it/s, loss=0.153, lr=1]\nSteps: 10%|▉ | 147/1500 [01:36<14:39, 1.54it/s, loss=0.153, lr=1]\nSteps: 10%|▉ | 147/1500 [01:36<14:39, 1.54it/s, loss=0.331, lr=1]\nSteps: 10%|▉ | 148/1500 [01:36<14:37, 1.54it/s, loss=0.331, lr=1]\nSteps: 10%|▉ | 148/1500 [01:36<14:37, 1.54it/s, loss=0.0639, lr=1]\nSteps: 10%|▉ | 149/1500 [01:37<14:36, 1.54it/s, loss=0.0639, lr=1]\nSteps: 10%|▉ | 149/1500 [01:37<14:36, 1.54it/s, loss=0.188, lr=1] \nSteps: 10%|█ | 150/1500 [01:38<14:33, 1.54it/s, loss=0.188, lr=1]\nSteps: 10%|█ | 150/1500 [01:38<14:33, 1.54it/s, loss=0.121, lr=1]\nSteps: 10%|█ | 151/1500 [01:38<14:32, 1.55it/s, loss=0.121, lr=1]\nSteps: 10%|█ | 151/1500 [01:38<14:32, 1.55it/s, loss=0.133, lr=1]\nSteps: 10%|█ | 152/1500 [01:39<14:31, 1.55it/s, loss=0.133, lr=1]\nSteps: 10%|█ | 152/1500 [01:39<14:31, 1.55it/s, loss=0.189, lr=1]\nSteps: 10%|█ | 153/1500 [01:40<14:30, 1.55it/s, loss=0.189, lr=1]\nSteps: 10%|█ | 153/1500 [01:40<14:30, 1.55it/s, loss=0.169, lr=1]\nSteps: 10%|█ | 154/1500 [01:40<14:29, 1.55it/s, loss=0.169, lr=1]\nSteps: 10%|█ | 154/1500 [01:40<14:29, 1.55it/s, loss=0.152, lr=1]\nSteps: 10%|█ | 155/1500 [01:41<14:29, 1.55it/s, loss=0.152, lr=1]\nSteps: 10%|█ | 155/1500 [01:41<14:29, 1.55it/s, loss=0.159, lr=1]\nSteps: 10%|█ | 156/1500 [01:42<14:28, 1.55it/s, loss=0.159, lr=1]\nSteps: 10%|█ | 156/1500 [01:42<14:28, 1.55it/s, loss=0.139, lr=1]\nSteps: 10%|█ | 157/1500 [01:42<14:27, 1.55it/s, loss=0.139, lr=1]\nSteps: 10%|█ | 157/1500 [01:42<14:27, 1.55it/s, loss=0.15, lr=1] \nSteps: 11%|█ | 158/1500 [01:43<14:26, 1.55it/s, loss=0.15, lr=1]\nSteps: 11%|█ | 158/1500 [01:43<14:26, 1.55it/s, loss=0.164, lr=1]\nSteps: 11%|█ | 159/1500 [01:43<14:26, 1.55it/s, loss=0.164, lr=1]\nSteps: 11%|█ | 159/1500 [01:43<14:26, 1.55it/s, loss=0.171, lr=1]\nSteps: 11%|█ | 160/1500 [01:44<14:24, 1.55it/s, loss=0.171, lr=1]\nSteps: 11%|█ | 160/1500 [01:44<14:24, 1.55it/s, loss=0.22, lr=1] \nSteps: 11%|█ | 161/1500 
[01:45<14:29, 1.54it/s, loss=0.22, lr=1]\nSteps: 11%|█ | 161/1500 [01:45<14:29, 1.54it/s, loss=0.169, lr=1]\nSteps: 11%|█ | 162/1500 [01:45<14:27, 1.54it/s, loss=0.169, lr=1]\nSteps: 11%|█ | 162/1500 [01:45<14:27, 1.54it/s, loss=0.116, lr=1]\nSteps: 11%|█ | 163/1500 [01:46<14:25, 1.54it/s, loss=0.116, lr=1]\nSteps: 11%|█ | 163/1500 [01:46<14:25, 1.54it/s, loss=0.11, lr=1] \nSteps: 11%|█ | 164/1500 [01:47<14:24, 1.55it/s, loss=0.11, lr=1]\nSteps: 11%|█ | 164/1500 [01:47<14:24, 1.55it/s, loss=0.143, lr=1]\nSteps: 11%|█ | 165/1500 [01:47<14:22, 1.55it/s, loss=0.143, lr=1]\nSteps: 11%|█ | 165/1500 [01:47<14:22, 1.55it/s, loss=0.108, lr=1]\nSteps: 11%|█ | 166/1500 [01:48<14:21, 1.55it/s, loss=0.108, lr=1]\nSteps: 11%|█ | 166/1500 [01:48<14:21, 1.55it/s, loss=0.154, lr=1]\nSteps: 11%|█ | 167/1500 [01:49<14:20, 1.55it/s, loss=0.154, lr=1]\nSteps: 11%|█ | 167/1500 [01:49<14:20, 1.55it/s, loss=0.1, lr=1] \nSteps: 11%|█ | 168/1500 [01:49<14:19, 1.55it/s, loss=0.1, lr=1]\nSteps: 11%|█ | 168/1500 [01:49<14:19, 1.55it/s, loss=0.195, lr=1]\nSteps: 11%|█▏ | 169/1500 [01:50<14:19, 1.55it/s, loss=0.195, lr=1]\nSteps: 11%|█▏ | 169/1500 [01:50<14:19, 1.55it/s, loss=0.155, lr=1]\nSteps: 11%|█▏ | 170/1500 [01:51<14:18, 1.55it/s, loss=0.155, lr=1]\nSteps: 11%|█▏ | 170/1500 [01:51<14:18, 1.55it/s, loss=0.241, lr=1]\nSteps: 11%|█▏ | 171/1500 [01:51<14:17, 1.55it/s, loss=0.241, lr=1]\nSteps: 11%|█▏ | 171/1500 [01:51<14:17, 1.55it/s, loss=0.115, lr=1]\nSteps: 11%|█▏ | 172/1500 [01:52<14:16, 1.55it/s, loss=0.115, lr=1]\nSteps: 11%|█▏ | 172/1500 [01:52<14:16, 1.55it/s, loss=0.167, lr=1]\nSteps: 12%|█▏ | 173/1500 [01:53<14:15, 1.55it/s, loss=0.167, lr=1]\nSteps: 12%|█▏ | 173/1500 [01:53<14:15, 1.55it/s, loss=0.261, lr=1]\nSteps: 12%|█▏ | 174/1500 [01:53<14:14, 1.55it/s, loss=0.261, lr=1]\nSteps: 12%|█▏ | 174/1500 [01:53<14:14, 1.55it/s, loss=0.209, lr=1]\nSteps: 12%|█▏ | 175/1500 [01:54<14:15, 1.55it/s, loss=0.209, lr=1]\nSteps: 12%|█▏ | 175/1500 [01:54<14:15, 1.55it/s, loss=0.15, lr=1] 
\nSteps: 12%|█▏ | 176/1500 [01:54<14:14, 1.55it/s, loss=0.15, lr=1]\nSteps: 12%|█▏ | 176/1500 [01:54<14:14, 1.55it/s, loss=0.213, lr=1]\nSteps: 12%|█▏ | 177/1500 [01:55<14:18, 1.54it/s, loss=0.213, lr=1]\nSteps: 12%|█▏ | 177/1500 [01:55<14:18, 1.54it/s, loss=0.223, lr=1]\nSteps: 12%|█▏ | 178/1500 [01:56<14:18, 1.54it/s, loss=0.223, lr=1]\nSteps: 12%|█▏ | 178/1500 [01:56<14:18, 1.54it/s, loss=0.122, lr=1]\nSteps: 12%|█▏ | 179/1500 [01:56<14:15, 1.54it/s, loss=0.122, lr=1]\nSteps: 12%|█▏ | 179/1500 [01:56<14:15, 1.54it/s, loss=0.193, lr=1]\nSteps: 12%|█▏ | 180/1500 [01:57<14:13, 1.55it/s, loss=0.193, lr=1]\nSteps: 12%|█▏ | 180/1500 [01:57<14:13, 1.55it/s, loss=0.174, lr=1]\nSteps: 12%|█▏ | 181/1500 [01:58<14:12, 1.55it/s, loss=0.174, lr=1]\nSteps: 12%|█▏ | 181/1500 [01:58<14:12, 1.55it/s, loss=0.0489, lr=1]\nSteps: 12%|█▏ | 182/1500 [01:58<14:11, 1.55it/s, loss=0.0489, lr=1]\nSteps: 12%|█▏ | 182/1500 [01:58<14:11, 1.55it/s, loss=0.179, lr=1] \nSteps: 12%|█▏ | 183/1500 [01:59<14:09, 1.55it/s, loss=0.179, lr=1]\nSteps: 12%|█▏ | 183/1500 [01:59<14:09, 1.55it/s, loss=0.111, lr=1]\nSteps: 12%|█▏ | 184/1500 [02:00<14:08, 1.55it/s, loss=0.111, lr=1]\nSteps: 12%|█▏ | 184/1500 [02:00<14:08, 1.55it/s, loss=0.166, lr=1]\nSteps: 12%|█▏ | 185/1500 [02:00<14:07, 1.55it/s, loss=0.166, lr=1]\nSteps: 12%|█▏ | 185/1500 [02:00<14:07, 1.55it/s, loss=0.094, lr=1]\nSteps: 12%|█▏ | 186/1500 [02:01<14:07, 1.55it/s, loss=0.094, lr=1]\nSteps: 12%|█▏ | 186/1500 [02:01<14:07, 1.55it/s, loss=0.18, lr=1] \nSteps: 12%|█▏ | 187/1500 [02:02<14:06, 1.55it/s, loss=0.18, lr=1]\nSteps: 12%|█▏ | 187/1500 [02:02<14:06, 1.55it/s, loss=0.0663, lr=1]\nSteps: 13%|█▎ | 188/1500 [02:02<14:06, 1.55it/s, loss=0.0663, lr=1]\nSteps: 13%|█▎ | 188/1500 [02:02<14:06, 1.55it/s, loss=0.0255, lr=1]\nSteps: 13%|█▎ | 189/1500 [02:03<14:06, 1.55it/s, loss=0.0255, lr=1]\nSteps: 13%|█▎ | 189/1500 [02:03<14:06, 1.55it/s, loss=0.218, lr=1] \nSteps: 13%|█▎ | 190/1500 [02:04<14:05, 1.55it/s, loss=0.218, lr=1]\nSteps: 13%|█▎ | 
190/1500 [02:04<14:05, 1.55it/s, loss=0.0205, lr=1]\nSteps: 13%|█▎ | 191/1500 [02:04<14:03, 1.55it/s, loss=0.0205, lr=1]\nSteps: 13%|█▎ | 191/1500 [02:04<14:03, 1.55it/s, loss=0.165, lr=1] \nSteps: 13%|█▎ | 192/1500 [02:05<14:04, 1.55it/s, loss=0.165, lr=1]\nSteps: 13%|█▎ | 192/1500 [02:05<14:04, 1.55it/s, loss=0.261, lr=1]\nSteps: 13%|█▎ | 193/1500 [02:05<14:08, 1.54it/s, loss=0.261, lr=1]\nSteps: 13%|█▎ | 193/1500 [02:05<14:08, 1.54it/s, loss=0.105, lr=1]\nSteps: 13%|█▎ | 194/1500 [02:06<14:06, 1.54it/s, loss=0.105, lr=1]\nSteps: 13%|█▎ | 194/1500 [02:06<14:06, 1.54it/s, loss=0.17, lr=1] \nSteps: 13%|█▎ | 195/1500 [02:07<14:04, 1.55it/s, loss=0.17, lr=1]\nSteps: 13%|█▎ | 195/1500 [02:07<14:04, 1.55it/s, loss=0.0959, lr=1]\nSteps: 13%|█▎ | 196/1500 [02:07<14:02, 1.55it/s, loss=0.0959, lr=1]\nSteps: 13%|█▎ | 196/1500 [02:07<14:02, 1.55it/s, loss=0.129, lr=1] \nSteps: 13%|█▎ | 197/1500 [02:08<14:01, 1.55it/s, loss=0.129, lr=1]\nSteps: 13%|█▎ | 197/1500 [02:08<14:01, 1.55it/s, loss=0.155, lr=1]\nSteps: 13%|█▎ | 198/1500 [02:09<13:59, 1.55it/s, loss=0.155, lr=1]\nSteps: 13%|█▎ | 198/1500 [02:09<13:59, 1.55it/s, loss=0.0291, lr=1]\nSteps: 13%|█▎ | 199/1500 [02:09<13:59, 1.55it/s, loss=0.0291, lr=1]\nSteps: 13%|█▎ | 199/1500 [02:09<13:59, 1.55it/s, loss=0.0118, lr=1]\nSteps: 13%|█▎ | 200/1500 [02:10<13:58, 1.55it/s, loss=0.0118, lr=1]\nSteps: 13%|█▎ | 200/1500 [02:10<13:58, 1.55it/s, loss=0.241, lr=1] \nSteps: 13%|█▎ | 201/1500 [02:11<13:57, 1.55it/s, loss=0.241, lr=1]\nSteps: 13%|█▎ | 201/1500 [02:11<13:57, 1.55it/s, loss=0.0772, lr=1]\nSteps: 13%|█▎ | 202/1500 [02:11<13:56, 1.55it/s, loss=0.0772, lr=1]\nSteps: 13%|█▎ | 202/1500 [02:11<13:56, 1.55it/s, loss=0.184, lr=1] \nSteps: 14%|█▎ | 203/1500 [02:12<13:56, 1.55it/s, loss=0.184, lr=1]\nSteps: 14%|█▎ | 203/1500 [02:12<13:56, 1.55it/s, loss=0.0356, lr=1]\nSteps: 14%|█▎ | 204/1500 [02:13<13:54, 1.55it/s, loss=0.0356, lr=1]\nSteps: 14%|█▎ | 204/1500 [02:13<13:54, 1.55it/s, loss=0.114, lr=1] \nSteps: 14%|█▎ | 205/1500 
[02:13<13:53, 1.55it/s, loss=0.114, lr=1]\nSteps: 14%|█▎ | 205/1500 [02:13<13:53, 1.55it/s, loss=0.133, lr=1]\nSteps: 14%|█▎ | 206/1500 [02:14<13:53, 1.55it/s, loss=0.133, lr=1]\nSteps: 14%|█▎ | 206/1500 [02:14<13:53, 1.55it/s, loss=0.327, lr=1]\nSteps: 14%|█▍ | 207/1500 [02:14<13:53, 1.55it/s, loss=0.327, lr=1]\nSteps: 14%|█▍ | 207/1500 [02:14<13:53, 1.55it/s, loss=0.0946, lr=1]\nSteps: 14%|█▍ | 208/1500 [02:15<13:52, 1.55it/s, loss=0.0946, lr=1]\nSteps: 14%|█▍ | 208/1500 [02:15<13:52, 1.55it/s, loss=0.167, lr=1] \nSteps: 14%|█▍ | 209/1500 [02:16<13:58, 1.54it/s, loss=0.167, lr=1]\nSteps: 14%|█▍ | 209/1500 [02:16<13:58, 1.54it/s, loss=0.0422, lr=1]\nSteps: 14%|█▍ | 210/1500 [02:16<13:56, 1.54it/s, loss=0.0422, lr=1]\nSteps: 14%|█▍ | 210/1500 [02:16<13:56, 1.54it/s, loss=0.117, lr=1] \nSteps: 14%|█▍ | 211/1500 [02:17<13:54, 1.54it/s, loss=0.117, lr=1]\nSteps: 14%|█▍ | 211/1500 [02:17<13:54, 1.54it/s, loss=0.0348, lr=1]\nSteps: 14%|█▍ | 212/1500 [02:18<13:53, 1.55it/s, loss=0.0348, lr=1]\nSteps: 14%|█▍ | 212/1500 [02:18<13:53, 1.55it/s, loss=0.171, lr=1] \nSteps: 14%|█▍ | 213/1500 [02:18<13:51, 1.55it/s, loss=0.171, lr=1]\nSteps: 14%|█▍ | 213/1500 [02:18<13:51, 1.55it/s, loss=0.0696, lr=1]\nSteps: 14%|█▍ | 214/1500 [02:19<13:50, 1.55it/s, loss=0.0696, lr=1]\nSteps: 14%|█▍ | 214/1500 [02:19<13:50, 1.55it/s, loss=0.0846, lr=1]\nSteps: 14%|█▍ | 215/1500 [02:20<13:48, 1.55it/s, loss=0.0846, lr=1]\nSteps: 14%|█▍ | 215/1500 [02:20<13:48, 1.55it/s, loss=0.146, lr=1] \nSteps: 14%|█▍ | 216/1500 [02:20<13:49, 1.55it/s, loss=0.146, lr=1]\nSteps: 14%|█▍ | 216/1500 [02:20<13:49, 1.55it/s, loss=0.121, lr=1]\nSteps: 14%|█▍ | 217/1500 [02:21<13:48, 1.55it/s, loss=0.121, lr=1]\nSteps: 14%|█▍ | 217/1500 [02:21<13:48, 1.55it/s, loss=0.246, lr=1]\nSteps: 15%|█▍ | 218/1500 [02:22<13:47, 1.55it/s, loss=0.246, lr=1]\nSteps: 15%|█▍ | 218/1500 [02:22<13:47, 1.55it/s, loss=0.115, lr=1]\nSteps: 15%|█▍ | 219/1500 [02:22<13:45, 1.55it/s, loss=0.115, lr=1]\nSteps: 15%|█▍ | 219/1500 [02:22<13:45, 
1.55it/s, loss=0.138, lr=1]\nSteps: 15%|█▍ | 220/1500 [02:23<13:45, 1.55it/s, loss=0.138, lr=1]\nSteps: 15%|█▍ | 220/1500 [02:23<13:45, 1.55it/s, loss=0.0753, lr=1]\nSteps: 15%|█▍ | 221/1500 [02:24<13:44, 1.55it/s, loss=0.0753, lr=1]\nSteps: 15%|█▍ | 221/1500 [02:24<13:44, 1.55it/s, loss=0.167, lr=1] \nSteps: 15%|█▍ | 222/1500 [02:24<13:44, 1.55it/s, loss=0.167, lr=1]\nSteps: 15%|█▍ | 222/1500 [02:24<13:44, 1.55it/s, loss=0.162, lr=1]\nSteps: 15%|█▍ | 223/1500 [02:25<13:43, 1.55it/s, loss=0.162, lr=1]\nSteps: 15%|█▍ | 223/1500 [02:25<13:43, 1.55it/s, loss=0.18, lr=1] \nSteps: 15%|█▍ | 224/1500 [02:25<13:42, 1.55it/s, loss=0.18, lr=1]\nSteps: 15%|█▍ | 224/1500 [02:25<13:42, 1.55it/s, loss=0.0729, lr=1]\nSteps: 15%|█▌ | 225/1500 [02:26<13:45, 1.54it/s, loss=0.0729, lr=1]\nSteps: 15%|█▌ | 225/1500 [02:26<13:45, 1.54it/s, loss=0.571, lr=1] \nSteps: 15%|█▌ | 226/1500 [02:27<13:44, 1.55it/s, loss=0.571, lr=1]\nSteps: 15%|█▌ | 226/1500 [02:27<13:44, 1.55it/s, loss=0.0794, lr=1]\nSteps: 15%|█▌ | 227/1500 [02:27<13:43, 1.55it/s, loss=0.0794, lr=1]\nSteps: 15%|█▌ | 227/1500 [02:27<13:43, 1.55it/s, loss=0.0834, lr=1]\nSteps: 15%|█▌ | 228/1500 [02:28<13:42, 1.55it/s, loss=0.0834, lr=1]\nSteps: 15%|█▌ | 228/1500 [02:28<13:42, 1.55it/s, loss=0.209, lr=1] \nSteps: 15%|█▌ | 229/1500 [02:29<13:41, 1.55it/s, loss=0.209, lr=1]\nSteps: 15%|█▌ | 229/1500 [02:29<13:41, 1.55it/s, loss=0.216, lr=1]\nSteps: 15%|█▌ | 230/1500 [02:29<13:40, 1.55it/s, loss=0.216, lr=1]\nSteps: 15%|█▌ | 230/1500 [02:29<13:40, 1.55it/s, loss=0.17, lr=1] \nSteps: 15%|█▌ | 231/1500 [02:30<13:39, 1.55it/s, loss=0.17, lr=1]\nSteps: 15%|█▌ | 231/1500 [02:30<13:39, 1.55it/s, loss=0.169, lr=1]\nSteps: 15%|█▌ | 232/1500 [02:31<13:38, 1.55it/s, loss=0.169, lr=1]\nSteps: 15%|█▌ | 232/1500 [02:31<13:38, 1.55it/s, loss=0.23, lr=1] \nSteps: 16%|█▌ | 233/1500 [02:31<13:37, 1.55it/s, loss=0.23, lr=1]\nSteps: 16%|█▌ | 233/1500 [02:31<13:37, 1.55it/s, loss=0.139, lr=1]\nSteps: 16%|█▌ | 234/1500 [02:32<13:36, 1.55it/s, 
loss=0.139, lr=1]\nSteps: 16%|█▌ | 234/1500 [02:32<13:36, 1.55it/s, loss=0.121, lr=1]\nSteps: 16%|█▌ | 235/1500 [02:33<13:36, 1.55it/s, loss=0.121, lr=1]\nSteps: 16%|█▌ | 235/1500 [02:33<13:36, 1.55it/s, loss=0.155, lr=1]\nSteps: 16%|█▌ | 236/1500 [02:33<13:34, 1.55it/s, loss=0.155, lr=1]\nSteps: 16%|█▌ | 236/1500 [02:33<13:34, 1.55it/s, loss=0.135, lr=1]\nSteps: 16%|█▌ | 237/1500 [02:34<13:34, 1.55it/s, loss=0.135, lr=1]\nSteps: 16%|█▌ | 237/1500 [02:34<13:34, 1.55it/s, loss=0.144, lr=1]\nSteps: 16%|█▌ | 238/1500 [02:35<13:35, 1.55it/s, loss=0.144, lr=1]\nSteps: 16%|█▌ | 238/1500 [02:35<13:35, 1.55it/s, loss=0.209, lr=1]\nSteps: 16%|█▌ | 239/1500 [02:35<13:35, 1.55it/s, loss=0.209, lr=1]\nSteps: 16%|█▌ | 239/1500 [02:35<13:35, 1.55it/s, loss=0.207, lr=1]\nSteps: 16%|█▌ | 240/1500 [02:36<13:34, 1.55it/s, loss=0.207, lr=1]\nSteps: 16%|█▌ | 240/1500 [02:36<13:34, 1.55it/s, loss=0.15, lr=1] \nSteps: 16%|█▌ | 241/1500 [02:36<13:38, 1.54it/s, loss=0.15, lr=1]\nSteps: 16%|█▌ | 241/1500 [02:36<13:38, 1.54it/s, loss=0.0908, lr=1]\nSteps: 16%|█▌ | 242/1500 [02:37<13:36, 1.54it/s, loss=0.0908, lr=1]\nSteps: 16%|█▌ | 242/1500 [02:37<13:36, 1.54it/s, loss=0.254, lr=1] \nSteps: 16%|█▌ | 243/1500 [02:38<13:35, 1.54it/s, loss=0.254, lr=1]\nSteps: 16%|█▌ | 243/1500 [02:38<13:35, 1.54it/s, loss=0.108, lr=1]\nSteps: 16%|█▋ | 244/1500 [02:38<13:32, 1.54it/s, loss=0.108, lr=1]\nSteps: 16%|█▋ | 244/1500 [02:38<13:32, 1.54it/s, loss=0.167, lr=1]\nSteps: 16%|█▋ | 245/1500 [02:39<13:30, 1.55it/s, loss=0.167, lr=1]\nSteps: 16%|█▋ | 245/1500 [02:39<13:30, 1.55it/s, loss=0.127, lr=1]\nSteps: 16%|█▋ | 246/1500 [02:40<13:29, 1.55it/s, loss=0.127, lr=1]\nSteps: 16%|█▋ | 246/1500 [02:40<13:29, 1.55it/s, loss=0.0929, lr=1]\nSteps: 16%|█▋ | 247/1500 [02:40<13:29, 1.55it/s, loss=0.0929, lr=1]\nSteps: 16%|█▋ | 247/1500 [02:40<13:29, 1.55it/s, loss=0.125, lr=1] \nSteps: 17%|█▋ | 248/1500 [02:41<13:28, 1.55it/s, loss=0.125, lr=1]\nSteps: 17%|█▋ | 248/1500 [02:41<13:28, 1.55it/s, loss=0.161, 
lr=1]\nSteps: 17%|█▋ | 249/1500 [02:42<13:26, 1.55it/s, loss=0.161, lr=1]\nSteps: 17%|█▋ | 249/1500 [02:42<13:26, 1.55it/s, loss=0.138, lr=1]\nSteps: 17%|█▋ | 250/1500 [02:42<13:26, 1.55it/s, loss=0.138, lr=1]\nSteps: 17%|█▋ | 250/1500 [02:42<13:26, 1.55it/s, loss=0.322, lr=1]\nSteps: 17%|█▋ | 251/1500 [02:43<13:26, 1.55it/s, loss=0.322, lr=1]\nSteps: 17%|█▋ | 251/1500 [02:43<13:26, 1.55it/s, loss=0.136, lr=1]\nSteps: 17%|█▋ | 252/1500 [02:44<13:24, 1.55it/s, loss=0.136, lr=1]\nSteps: 17%|█▋ | 252/1500 [02:44<13:24, 1.55it/s, loss=0.121, lr=1]\nSteps: 17%|█▋ | 253/1500 [02:44<13:24, 1.55it/s, loss=0.121, lr=1]\nSteps: 17%|█▋ | 253/1500 [02:44<13:24, 1.55it/s, loss=0.148, lr=1]\nSteps: 17%|█▋ | 254/1500 [02:45<13:23, 1.55it/s, loss=0.148, lr=1]\nSteps: 17%|█▋ | 254/1500 [02:45<13:23, 1.55it/s, loss=0.166, lr=1]\nSteps: 17%|█▋ | 255/1500 [02:45<13:22, 1.55it/s, loss=0.166, lr=1]\nSteps: 17%|█▋ | 255/1500 [02:45<13:22, 1.55it/s, loss=0.148, lr=1]\nSteps: 17%|█▋ | 256/1500 [02:46<13:20, 1.55it/s, loss=0.148, lr=1]\nSteps: 17%|█▋ | 256/1500 [02:46<13:20, 1.55it/s, loss=0.0994, lr=1]\nSteps: 17%|█▋ | 257/1500 [02:47<13:26, 1.54it/s, loss=0.0994, lr=1]\nSteps: 17%|█▋ | 257/1500 [02:47<13:26, 1.54it/s, loss=0.0712, lr=1]\nSteps: 17%|█▋ | 258/1500 [02:47<13:24, 1.54it/s, loss=0.0712, lr=1]\nSteps: 17%|█▋ | 258/1500 [02:47<13:24, 1.54it/s, loss=0.203, lr=1] \nSteps: 17%|█▋ | 259/1500 [02:48<13:21, 1.55it/s, loss=0.203, lr=1]\nSteps: 17%|█▋ | 259/1500 [02:48<13:21, 1.55it/s, loss=0.141, lr=1]\nSteps: 17%|█▋ | 260/1500 [02:49<13:20, 1.55it/s, loss=0.141, lr=1]\nSteps: 17%|█▋ | 260/1500 [02:49<13:20, 1.55it/s, loss=0.0987, lr=1]\nSteps: 17%|█▋ | 261/1500 [02:49<13:19, 1.55it/s, loss=0.0987, lr=1]\nSteps: 17%|█▋ | 261/1500 [02:49<13:19, 1.55it/s, loss=0.128, lr=1] \nSteps: 17%|█▋ | 262/1500 [02:50<13:18, 1.55it/s, loss=0.128, lr=1]\nSteps: 17%|█▋ | 262/1500 [02:50<13:18, 1.55it/s, loss=0.18, lr=1] \nSteps: 18%|█▊ | 263/1500 [02:51<13:17, 1.55it/s, loss=0.18, lr=1]\nSteps: 18%|█▊ 
| 263/1500 [02:51<13:17, 1.55it/s, loss=0.119, lr=1]\nSteps: 18%|█▊ | 264/1500 [02:51<13:15, 1.55it/s, loss=0.119, lr=1]\nSteps: 18%|█▊ | 264/1500 [02:51<13:15, 1.55it/s, loss=0.12, lr=1] \nSteps: 18%|█▊ | 265/1500 [02:52<13:14, 1.55it/s, loss=0.12, lr=1]\nSteps: 18%|█▊ | 265/1500 [02:52<13:14, 1.55it/s, loss=0.0896, lr=1]\nSteps: 18%|█▊ | 266/1500 [02:53<13:14, 1.55it/s, loss=0.0896, lr=1]\nSteps: 18%|█▊ | 266/1500 [02:53<13:14, 1.55it/s, loss=0.169, lr=1] \nSteps: 18%|█▊ | 267/1500 [02:53<13:14, 1.55it/s, loss=0.169, lr=1]\nSteps: 18%|█▊ | 267/1500 [02:53<13:14, 1.55it/s, loss=0.0474, lr=1]\nSteps: 18%|█▊ | 268/1500 [02:54<13:17, 1.55it/s, loss=0.0474, lr=1]\nSteps: 18%|█▊ | 268/1500 [02:54<13:17, 1.55it/s, loss=0.231, lr=1] \nSteps: 18%|█▊ | 269/1500 [02:55<13:16, 1.55it/s, loss=0.231, lr=1]\nSteps: 18%|█▊ | 269/1500 [02:55<13:16, 1.55it/s, loss=0.134, lr=1]\nSteps: 18%|█▊ | 270/1500 [02:55<13:15, 1.55it/s, loss=0.134, lr=1]\nSteps: 18%|█▊ | 270/1500 [02:55<13:15, 1.55it/s, loss=0.196, lr=1]\nSteps: 18%|█▊ | 271/1500 [02:56<13:15, 1.54it/s, loss=0.196, lr=1]\nSteps: 18%|█▊ | 271/1500 [02:56<13:15, 1.54it/s, loss=0.0384, lr=1]\nSteps: 18%|█▊ | 272/1500 [02:56<13:13, 1.55it/s, loss=0.0384, lr=1]\nSteps: 18%|█▊ | 272/1500 [02:56<13:13, 1.55it/s, loss=0.157, lr=1] \nSteps: 18%|█▊ | 273/1500 [02:57<13:17, 1.54it/s, loss=0.157, lr=1]\nSteps: 18%|█▊ | 273/1500 [02:57<13:17, 1.54it/s, loss=0.135, lr=1]\nSteps: 18%|█▊ | 274/1500 [02:58<13:14, 1.54it/s, loss=0.135, lr=1]\nSteps: 18%|█▊ | 274/1500 [02:58<13:14, 1.54it/s, loss=0.149, lr=1]\nSteps: 18%|█▊ | 275/1500 [02:58<13:12, 1.55it/s, loss=0.149, lr=1]\nSteps: 18%|█▊ | 275/1500 [02:58<13:12, 1.55it/s, loss=0.138, lr=1]\nSteps: 18%|█▊ | 276/1500 [02:59<13:12, 1.55it/s, loss=0.138, lr=1]\nSteps: 18%|█▊ | 276/1500 [02:59<13:12, 1.55it/s, loss=0.304, lr=1]\nSteps: 18%|█▊ | 277/1500 [03:00<13:10, 1.55it/s, loss=0.304, lr=1]\nSteps: 18%|█▊ | 277/1500 [03:00<13:10, 1.55it/s, loss=0.189, lr=1]\nSteps: 19%|█▊ | 278/1500 
[03:00<13:09, 1.55it/s, loss=0.189, lr=1]\nSteps: 19%|█▊ | 278/1500 [03:00<13:09, 1.55it/s, loss=0.256, lr=1]\nSteps: 19%|█▊ | 279/1500 [03:01<13:08, 1.55it/s, loss=0.256, lr=1]\nSteps: 19%|█▊ | 279/1500 [03:01<13:08, 1.55it/s, loss=0.0318, lr=1]\nSteps: 19%|█▊ | 280/1500 [03:02<13:07, 1.55it/s, loss=0.0318, lr=1]\nSteps: 19%|█▊ | 280/1500 [03:02<13:07, 1.55it/s, loss=0.0892, lr=1]\nSteps: 19%|█▊ | 281/1500 [03:02<13:05, 1.55it/s, loss=0.0892, lr=1]\nSteps: 19%|█▊ | 281/1500 [03:02<13:05, 1.55it/s, loss=0.191, lr=1] \nSteps: 19%|█▉ | 282/1500 [03:03<13:05, 1.55it/s, loss=0.191, lr=1]\nSteps: 19%|█▉ | 282/1500 [03:03<13:05, 1.55it/s, loss=0.132, lr=1]\nSteps: 19%|█▉ | 283/1500 [03:04<13:04, 1.55it/s, loss=0.132, lr=1]\nSteps: 19%|█▉ | 283/1500 [03:04<13:04, 1.55it/s, loss=0.12, lr=1] \nSteps: 19%|█▉ | 284/1500 [03:04<13:05, 1.55it/s, loss=0.12, lr=1]\nSteps: 19%|█▉ | 284/1500 [03:04<13:05, 1.55it/s, loss=0.18, lr=1]\nSteps: 19%|█▉ | 285/1500 [03:05<13:05, 1.55it/s, loss=0.18, lr=1]\nSteps: 19%|█▉ | 285/1500 [03:05<13:05, 1.55it/s, loss=0.109, lr=1]\nSteps: 19%|█▉ | 286/1500 [03:06<13:03, 1.55it/s, loss=0.109, lr=1]\nSteps: 19%|█▉ | 286/1500 [03:06<13:03, 1.55it/s, loss=0.0822, lr=1]\nSteps: 19%|█▉ | 287/1500 [03:06<13:02, 1.55it/s, loss=0.0822, lr=1]\nSteps: 19%|█▉ | 287/1500 [03:06<13:02, 1.55it/s, loss=0.0585, lr=1]\nSteps: 19%|█▉ | 288/1500 [03:07<13:01, 1.55it/s, loss=0.0585, lr=1]\nSteps: 19%|█▉ | 288/1500 [03:07<13:01, 1.55it/s, loss=0.266, lr=1] \nSteps: 19%|█▉ | 289/1500 [03:07<13:05, 1.54it/s, loss=0.266, lr=1]\nSteps: 19%|█▉ | 289/1500 [03:07<13:05, 1.54it/s, loss=0.141, lr=1]\nSteps: 19%|█▉ | 290/1500 [03:08<13:03, 1.55it/s, loss=0.141, lr=1]\nSteps: 19%|█▉ | 290/1500 [03:08<13:03, 1.55it/s, loss=0.2, lr=1] \nSteps: 19%|█▉ | 291/1500 [03:09<13:01, 1.55it/s, loss=0.2, lr=1]\nSteps: 19%|█▉ | 291/1500 [03:09<13:01, 1.55it/s, loss=0.271, lr=1]\nSteps: 19%|█▉ | 292/1500 [03:09<13:00, 1.55it/s, loss=0.271, lr=1]\nSteps: 19%|█▉ | 292/1500 [03:09<13:00, 1.55it/s, 
loss=0.197, lr=1]\nSteps: 20%|█▉ | 293/1500 [03:10<12:59, 1.55it/s, loss=0.197, lr=1]\nSteps: 20%|█▉ | 293/1500 [03:10<12:59, 1.55it/s, loss=0.171, lr=1]\nSteps: 20%|█▉ | 294/1500 [03:11<12:58, 1.55it/s, loss=0.171, lr=1]\nSteps: 20%|█▉ | 294/1500 [03:11<12:58, 1.55it/s, loss=0.0934, lr=1]\nSteps: 20%|█▉ | 295/1500 [03:11<12:57, 1.55it/s, loss=0.0934, lr=1]\nSteps: 20%|█▉ | 295/1500 [03:11<12:57, 1.55it/s, loss=0.0422, lr=1]\nSteps: 20%|█▉ | 296/1500 [03:12<12:56, 1.55it/s, loss=0.0422, lr=1]\nSteps: 20%|█▉ | 296/1500 [03:12<12:56, 1.55it/s, loss=0.29, lr=1] \nSteps: 20%|█▉ | 297/1500 [03:13<12:55, 1.55it/s, loss=0.29, lr=1]\nSteps: 20%|█▉ | 297/1500 [03:13<12:55, 1.55it/s, loss=0.237, lr=1]\nSteps: 20%|█▉ | 298/1500 [03:13<12:54, 1.55it/s, loss=0.237, lr=1]\nSteps: 20%|█▉ | 298/1500 [03:13<12:54, 1.55it/s, loss=0.142, lr=1]\nSteps: 20%|█▉ | 299/1500 [03:14<12:54, 1.55it/s, loss=0.142, lr=1]\nSteps: 20%|█▉ | 299/1500 [03:14<12:54, 1.55it/s, loss=0.0447, lr=1]\nSteps: 20%|██ | 300/1500 [03:15<12:53, 1.55it/s, loss=0.0447, lr=1]\nSteps: 20%|██ | 300/1500 [03:15<12:53, 1.55it/s, loss=0.112, lr=1] \nSteps: 20%|██ | 301/1500 [03:15<12:52, 1.55it/s, loss=0.112, lr=1]\nSteps: 20%|██ | 301/1500 [03:15<12:52, 1.55it/s, loss=0.0553, lr=1]\nSteps: 20%|██ | 302/1500 [03:16<12:53, 1.55it/s, loss=0.0553, lr=1]\nSteps: 20%|██ | 302/1500 [03:16<12:53, 1.55it/s, loss=0.0361, lr=1]\nSteps: 20%|██ | 303/1500 [03:16<12:52, 1.55it/s, loss=0.0361, lr=1]\nSteps: 20%|██ | 303/1500 [03:16<12:52, 1.55it/s, loss=0.0686, lr=1]\nSteps: 20%|██ | 304/1500 [03:17<12:50, 1.55it/s, loss=0.0686, lr=1]\nSteps: 20%|██ | 304/1500 [03:17<12:50, 1.55it/s, loss=0.0536, lr=1]\nSteps: 20%|██ | 305/1500 [03:18<12:54, 1.54it/s, loss=0.0536, lr=1]\nSteps: 20%|██ | 305/1500 [03:18<12:54, 1.54it/s, loss=0.14, lr=1] \nSteps: 20%|██ | 306/1500 [03:18<12:53, 1.54it/s, loss=0.14, lr=1]\nSteps: 20%|██ | 306/1500 [03:18<12:53, 1.54it/s, loss=0.144, lr=1]\nSteps: 20%|██ | 307/1500 [03:19<12:51, 1.55it/s, loss=0.144, 
lr=1]\nSteps: 20%|██ | 307/1500 [03:19<12:51, 1.55it/s, loss=0.102, lr=1]\nSteps: 21%|██ | 308/1500 [03:20<12:51, 1.54it/s, loss=0.102, lr=1]\nSteps: 21%|██ | 308/1500 [03:20<12:51, 1.54it/s, loss=0.236, lr=1]\nSteps: 21%|██ | 309/1500 [03:20<12:50, 1.54it/s, loss=0.236, lr=1]\nSteps: 21%|██ | 309/1500 [03:20<12:50, 1.54it/s, loss=0.0863, lr=1]\nSteps: 21%|██ | 310/1500 [03:21<12:49, 1.55it/s, loss=0.0863, lr=1]\nSteps: 21%|██ | 310/1500 [03:21<12:49, 1.55it/s, loss=0.11, lr=1] \nSteps: 21%|██ | 311/1500 [03:22<12:47, 1.55it/s, loss=0.11, lr=1]\nSteps: 21%|██ | 311/1500 [03:22<12:47, 1.55it/s, loss=0.263, lr=1]\nSteps: 21%|██ | 312/1500 [03:22<12:46, 1.55it/s, loss=0.263, lr=1]\nSteps: 21%|██ | 312/1500 [03:22<12:46, 1.55it/s, loss=0.232, lr=1]\nSteps: 21%|██ | 313/1500 [03:23<12:46, 1.55it/s, loss=0.232, lr=1]\nSteps: 21%|██ | 313/1500 [03:23<12:46, 1.55it/s, loss=0.126, lr=1]\nSteps: 21%|██ | 314/1500 [03:24<12:44, 1.55it/s, loss=0.126, lr=1]\nSteps: 21%|██ | 314/1500 [03:24<12:44, 1.55it/s, loss=0.0702, lr=1]\nSteps: 21%|██ | 315/1500 [03:24<12:44, 1.55it/s, loss=0.0702, lr=1]\nSteps: 21%|██ | 315/1500 [03:24<12:44, 1.55it/s, loss=0.125, lr=1] \nSteps: 21%|██ | 316/1500 [03:25<12:43, 1.55it/s, loss=0.125, lr=1]\nSteps: 21%|██ | 316/1500 [03:25<12:43, 1.55it/s, loss=0.131, lr=1]\nSteps: 21%|██ | 317/1500 [03:26<12:42, 1.55it/s, loss=0.131, lr=1]\nSteps: 21%|██ | 317/1500 [03:26<12:42, 1.55it/s, loss=0.0772, lr=1]\nSteps: 21%|██ | 318/1500 [03:26<12:43, 1.55it/s, loss=0.0772, lr=1]\nSteps: 21%|██ | 318/1500 [03:26<12:43, 1.55it/s, loss=0.294, lr=1] \nSteps: 21%|██▏ | 319/1500 [03:27<12:43, 1.55it/s, loss=0.294, lr=1]\nSteps: 21%|██▏ | 319/1500 [03:27<12:43, 1.55it/s, loss=0.0634, lr=1]\nSteps: 21%|██▏ | 320/1500 [03:27<12:43, 1.54it/s, loss=0.0634, lr=1]\nSteps: 21%|██▏ | 320/1500 [03:27<12:43, 1.54it/s, loss=0.0562, lr=1]\nSteps: 21%|██▏ | 321/1500 [03:28<12:48, 1.54it/s, loss=0.0562, lr=1]\nSteps: 21%|██▏ | 321/1500 [03:28<12:48, 1.54it/s, loss=0.169, lr=1] 
\nSteps: 21%|██▏ | 322/1500 [03:29<12:46, 1.54it/s, loss=0.169, lr=1]\nSteps: 21%|██▏ | 322/1500 [03:29<12:46, 1.54it/s, loss=0.262, lr=1]\nSteps: 22%|██▏ | 323/1500 [03:29<12:44, 1.54it/s, loss=0.262, lr=1]\nSteps: 22%|██▏ | 323/1500 [03:29<12:44, 1.54it/s, loss=0.128, lr=1]\nSteps: 22%|██▏ | 324/1500 [03:30<12:42, 1.54it/s, loss=0.128, lr=1]\nSteps: 22%|██▏ | 324/1500 [03:30<12:42, 1.54it/s, loss=0.0749, lr=1]\nSteps: 22%|██▏ | 325/1500 [03:31<12:41, 1.54it/s, loss=0.0749, lr=1]\nSteps: 22%|██▏ | 325/1500 [03:31<12:41, 1.54it/s, loss=0.141, lr=1] \nSteps: 22%|██▏ | 326/1500 [03:31<12:43, 1.54it/s, loss=0.141, lr=1]\nSteps: 22%|██▏ | 326/1500 [03:31<12:43, 1.54it/s, loss=0.0817, lr=1]\nSteps: 22%|██▏ | 327/1500 [03:32<12:41, 1.54it/s, loss=0.0817, lr=1]\nSteps: 22%|██▏ | 327/1500 [03:32<12:41, 1.54it/s, loss=0.128, lr=1] \nSteps: 22%|██▏ | 328/1500 [03:33<12:39, 1.54it/s, loss=0.128, lr=1]\nSteps: 22%|██▏ | 328/1500 [03:33<12:39, 1.54it/s, loss=0.0994, lr=1]\nSteps: 22%|██▏ | 329/1500 [03:33<12:38, 1.54it/s, loss=0.0994, lr=1]\nSteps: 22%|██▏ | 329/1500 [03:33<12:38, 1.54it/s, loss=0.192, lr=1] \nSteps: 22%|██▏ | 330/1500 [03:34<12:38, 1.54it/s, loss=0.192, lr=1]\nSteps: 22%|██▏ | 330/1500 [03:34<12:38, 1.54it/s, loss=0.0226, lr=1]\nSteps: 22%|██▏ | 331/1500 [03:35<12:37, 1.54it/s, loss=0.0226, lr=1]\nSteps: 22%|██▏ | 331/1500 [03:35<12:37, 1.54it/s, loss=0.143, lr=1] \nSteps: 22%|██▏ | 332/1500 [03:35<12:36, 1.54it/s, loss=0.143, lr=1]\nSteps: 22%|██▏ | 332/1500 [03:35<12:36, 1.54it/s, loss=0.099, lr=1]\nSteps: 22%|██▏ | 333/1500 [03:36<12:36, 1.54it/s, loss=0.099, lr=1]\nSteps: 22%|██▏ | 333/1500 [03:36<12:36, 1.54it/s, loss=0.089, lr=1]\nSteps: 22%|██▏ | 334/1500 [03:37<12:34, 1.54it/s, loss=0.089, lr=1]\nSteps: 22%|██▏ | 334/1500 [03:37<12:34, 1.54it/s, loss=0.148, lr=1]\nSteps: 22%|██▏ | 335/1500 [03:37<12:34, 1.54it/s, loss=0.148, lr=1]\nSteps: 22%|██▏ | 335/1500 [03:37<12:34, 1.54it/s, loss=0.076, lr=1]\nSteps: 22%|██▏ | 336/1500 [03:38<12:32, 1.55it/s, 
loss=0.076, lr=1]\nSteps: 22%|██▏ | 336/1500 [03:38<12:32, 1.55it/s, loss=0.0491, lr=1]\nSteps: 22%|██▏ | 337/1500 [03:39<12:36, 1.54it/s, loss=0.0491, lr=1]\nSteps: 22%|██▏ | 337/1500 [03:39<12:36, 1.54it/s, loss=0.143, lr=1] \nSteps: 23%|██▎ | 338/1500 [03:39<12:34, 1.54it/s, loss=0.143, lr=1]\nSteps: 23%|██▎ | 338/1500 [03:39<12:34, 1.54it/s, loss=0.111, lr=1]\nSteps: 23%|██▎ | 339/1500 [03:40<12:33, 1.54it/s, loss=0.111, lr=1]\nSteps: 23%|██▎ | 339/1500 [03:40<12:33, 1.54it/s, loss=0.141, lr=1]\nSteps: 23%|██▎ | 340/1500 [03:40<12:32, 1.54it/s, loss=0.141, lr=1]\nSteps: 23%|██▎ | 340/1500 [03:40<12:32, 1.54it/s, loss=0.0985, lr=1]\nSteps: 23%|██▎ | 341/1500 [03:41<12:31, 1.54it/s, loss=0.0985, lr=1]\nSteps: 23%|██▎ | 341/1500 [03:41<12:31, 1.54it/s, loss=0.035, lr=1] \nSteps: 23%|██▎ | 342/1500 [03:42<12:30, 1.54it/s, loss=0.035, lr=1]\nSteps: 23%|██▎ | 342/1500 [03:42<12:30, 1.54it/s, loss=0.171, lr=1]\nSteps: 23%|██▎ | 343/1500 [03:42<12:30, 1.54it/s, loss=0.171, lr=1]\nSteps: 23%|██▎ | 343/1500 [03:42<12:30, 1.54it/s, loss=0.0746, lr=1]\nSteps: 23%|██▎ | 344/1500 [03:43<12:28, 1.54it/s, loss=0.0746, lr=1]\nSteps: 23%|██▎ | 344/1500 [03:43<12:28, 1.54it/s, loss=0.029, lr=1] \nSteps: 23%|██▎ | 345/1500 [03:44<12:29, 1.54it/s, loss=0.029, lr=1]\nSteps: 23%|██▎ | 345/1500 [03:44<12:29, 1.54it/s, loss=0.127, lr=1]\nSteps: 23%|██▎ | 346/1500 [03:44<12:33, 1.53it/s, loss=0.127, lr=1]\nSteps: 23%|██▎ | 346/1500 [03:44<12:33, 1.53it/s, loss=0.242, lr=1]\nSteps: 23%|██▎ | 347/1500 [03:45<12:31, 1.53it/s, loss=0.242, lr=1]\nSteps: 23%|██▎ | 347/1500 [03:45<12:31, 1.53it/s, loss=0.217, lr=1]\nSteps: 23%|██▎ | 348/1500 [03:46<12:28, 1.54it/s, loss=0.217, lr=1]\nSteps: 23%|██▎ | 348/1500 [03:46<12:28, 1.54it/s, loss=0.155, lr=1]\nSteps: 23%|██▎ | 349/1500 [03:46<12:27, 1.54it/s, loss=0.155, lr=1]\nSteps: 23%|██▎ | 349/1500 [03:46<12:27, 1.54it/s, loss=0.0558, lr=1]\nSteps: 23%|██▎ | 350/1500 [03:47<12:25, 1.54it/s, loss=0.0558, lr=1]\nSteps: 23%|██▎ | 350/1500 
[03:47<12:25, 1.54it/s, loss=0.328, lr=1] \nSteps: 23%|██▎ | 351/1500 [03:48<12:22, 1.55it/s, loss=0.328, lr=1]\nSteps: 23%|██▎ | 351/1500 [03:48<12:22, 1.55it/s, loss=0.18, lr=1] \nSteps: 23%|██▎ | 352/1500 [03:48<12:21, 1.55it/s, loss=0.18, lr=1]\nSteps: 23%|██▎ | 352/1500 [03:48<12:21, 1.55it/s, loss=0.11, lr=1]\nSteps: 24%|██▎ | 353/1500 [03:49<12:24, 1.54it/s, loss=0.11, lr=1]\nSteps: 24%|██▎ | 353/1500 [03:49<12:24, 1.54it/s, loss=0.0542, lr=1]\nSteps: 24%|██▎ | 354/1500 [03:50<12:22, 1.54it/s, loss=0.0542, lr=1]\nSteps: 24%|██▎ | 354/1500 [03:50<12:22, 1.54it/s, loss=0.166, lr=1] \nSteps: 24%|██▎ | 355/1500 [03:50<12:21, 1.54it/s, loss=0.166, lr=1]\nSteps: 24%|██▎ | 355/1500 [03:50<12:21, 1.54it/s, loss=0.0277, lr=1]\nSteps: 24%|██▎ | 356/1500 [03:51<12:21, 1.54it/s, loss=0.0277, lr=1]\nSteps: 24%|██▎ | 356/1500 [03:51<12:21, 1.54it/s, loss=0.149, lr=1] \nSteps: 24%|██▍ | 357/1500 [03:51<12:19, 1.55it/s, loss=0.149, lr=1]\nSteps: 24%|██▍ | 357/1500 [03:51<12:19, 1.55it/s, loss=0.023, lr=1]\nSteps: 24%|██▍ | 358/1500 [03:52<12:17, 1.55it/s, loss=0.023, lr=1]\nSteps: 24%|██▍ | 358/1500 [03:52<12:17, 1.55it/s, loss=0.143, lr=1]\nSteps: 24%|██▍ | 359/1500 [03:53<12:17, 1.55it/s, loss=0.143, lr=1]\nSteps: 24%|██▍ | 359/1500 [03:53<12:17, 1.55it/s, loss=0.11, lr=1] \nSteps: 24%|██▍ | 360/1500 [03:53<12:16, 1.55it/s, loss=0.11, lr=1]\nSteps: 24%|██▍ | 360/1500 [03:53<12:16, 1.55it/s, loss=0.45, lr=1]\nSteps: 24%|██▍ | 361/1500 [03:54<12:15, 1.55it/s, loss=0.45, lr=1]\nSteps: 24%|██▍ | 361/1500 [03:54<12:15, 1.55it/s, loss=0.159, lr=1]\nSteps: 24%|██▍ | 362/1500 [03:55<12:15, 1.55it/s, loss=0.159, lr=1]\nSteps: 24%|██▍ | 362/1500 [03:55<12:15, 1.55it/s, loss=0.218, lr=1]\nSteps: 24%|██▍ | 363/1500 [03:55<12:14, 1.55it/s, loss=0.218, lr=1]\nSteps: 24%|██▍ | 363/1500 [03:55<12:14, 1.55it/s, loss=0.228, lr=1]\nSteps: 24%|██▍ | 364/1500 [03:56<12:15, 1.54it/s, loss=0.228, lr=1]\nSteps: 24%|██▍ | 364/1500 [03:56<12:15, 1.54it/s, loss=0.164, lr=1]\nSteps: 24%|██▍ | 
365/1500 [03:57<12:13, 1.55it/s, loss=0.164, lr=1]\nSteps: 24%|██▍ | 365/1500 [03:57<12:13, 1.55it/s, loss=0.151, lr=1]\nSteps: 24%|██▍ | 366/1500 [03:57<12:13, 1.55it/s, loss=0.151, lr=1]\nSteps: 24%|██▍ | 366/1500 [03:57<12:13, 1.55it/s, loss=0.188, lr=1]\nSteps: 24%|██▍ | 367/1500 [03:58<12:14, 1.54it/s, loss=0.188, lr=1]\nSteps: 24%|██▍ | 367/1500 [03:58<12:14, 1.54it/s, loss=0.0827, lr=1]\nSteps: 25%|██▍ | 368/1500 [03:59<12:12, 1.55it/s, loss=0.0827, lr=1]\nSteps: 25%|██▍ | 368/1500 [03:59<12:12, 1.55it/s, loss=0.0621, lr=1]\nSteps: 25%|██▍ | 369/1500 [03:59<12:15, 1.54it/s, loss=0.0621, lr=1]\nSteps: 25%|██▍ | 369/1500 [03:59<12:15, 1.54it/s, loss=0.151, lr=1] \nSteps: 25%|██▍ | 370/1500 [04:00<12:14, 1.54it/s, loss=0.151, lr=1]\nSteps: 25%|██▍ | 370/1500 [04:00<12:14, 1.54it/s, loss=0.256, lr=1]\nSteps: 25%|██▍ | 371/1500 [04:01<12:11, 1.54it/s, loss=0.256, lr=1]\nSteps: 25%|██▍ | 371/1500 [04:01<12:11, 1.54it/s, loss=0.176, lr=1]\nSteps: 25%|██▍ | 372/1500 [04:01<12:09, 1.55it/s, loss=0.176, lr=1]\nSteps: 25%|██▍ | 372/1500 [04:01<12:09, 1.55it/s, loss=0.113, lr=1]\nSteps: 25%|██▍ | 373/1500 [04:02<12:09, 1.54it/s, loss=0.113, lr=1]\nSteps: 25%|██▍ | 373/1500 [04:02<12:09, 1.54it/s, loss=0.308, lr=1]\nSteps: 25%|██▍ | 374/1500 [04:02<12:09, 1.54it/s, loss=0.308, lr=1]\nSteps: 25%|██▍ | 374/1500 [04:02<12:09, 1.54it/s, loss=0.155, lr=1]\nSteps: 25%|██▌ | 375/1500 [04:03<12:07, 1.55it/s, loss=0.155, lr=1]\nSteps: 25%|██▌ | 375/1500 [04:03<12:07, 1.55it/s, loss=0.259, lr=1]\nSteps: 25%|██▌ | 376/1500 [04:04<12:08, 1.54it/s, loss=0.259, lr=1]\nSteps: 25%|██▌ | 376/1500 [04:04<12:08, 1.54it/s, loss=0.0988, lr=1]\nSteps: 25%|██▌ | 377/1500 [04:04<12:07, 1.54it/s, loss=0.0988, lr=1]\nSteps: 25%|██▌ | 377/1500 [04:04<12:07, 1.54it/s, loss=0.135, lr=1] \nSteps: 25%|██▌ | 378/1500 [04:05<12:06, 1.55it/s, loss=0.135, lr=1]\nSteps: 25%|██▌ | 378/1500 [04:05<12:06, 1.55it/s, loss=0.295, lr=1]\nSteps: 25%|██▌ | 379/1500 [04:06<12:05, 1.54it/s, loss=0.295, lr=1]\nSteps: 
25%|██▌ | 379/1500 [04:06<12:05, 1.54it/s, loss=0.191, lr=1]\nSteps: 25%|██▌ | 380/1500 [04:06<12:04, 1.55it/s, loss=0.191, lr=1]\nSteps: 25%|██▌ | 380/1500 [04:06<12:04, 1.55it/s, loss=0.0877, lr=1]\nSteps: 25%|██▌ | 381/1500 [04:07<12:02, 1.55it/s, loss=0.0877, lr=1]\nSteps: 25%|██▌ | 381/1500 [04:07<12:02, 1.55it/s, loss=0.118, lr=1] \nSteps: 25%|██▌ | 382/1500 [04:08<12:01, 1.55it/s, loss=0.118, lr=1]\nSteps: 25%|██▌ | 382/1500 [04:08<12:01, 1.55it/s, loss=0.215, lr=1]\nSteps: 26%|██▌ | 383/1500 [04:08<12:01, 1.55it/s, loss=0.215, lr=1]\nSteps: 26%|██▌ | 383/1500 [04:08<12:01, 1.55it/s, loss=0.157, lr=1]\nSteps: 26%|██▌ | 384/1500 [04:09<12:00, 1.55it/s, loss=0.157, lr=1]\nSteps: 26%|██▌ | 384/1500 [04:09<12:00, 1.55it/s, loss=0.177, lr=1]\nSteps: 26%|██▌ | 385/1500 [04:10<12:03, 1.54it/s, loss=0.177, lr=1]\nSteps: 26%|██▌ | 385/1500 [04:10<12:03, 1.54it/s, loss=0.204, lr=1]\nSteps: 26%|██▌ | 386/1500 [04:10<12:01, 1.54it/s, loss=0.204, lr=1]\nSteps: 26%|██▌ | 386/1500 [04:10<12:01, 1.54it/s, loss=0.131, lr=1]\nSteps: 26%|██▌ | 387/1500 [04:11<11:59, 1.55it/s, loss=0.131, lr=1]\nSteps: 26%|██▌ | 387/1500 [04:11<11:59, 1.55it/s, loss=0.184, lr=1]\nSteps: 26%|██▌ | 388/1500 [04:12<11:57, 1.55it/s, loss=0.184, lr=1]\nSteps: 26%|██▌ | 388/1500 [04:12<11:57, 1.55it/s, loss=0.0585, lr=1]\nSteps: 26%|██▌ | 389/1500 [04:12<11:56, 1.55it/s, loss=0.0585, lr=1]\nSteps: 26%|██▌ | 389/1500 [04:12<11:56, 1.55it/s, loss=0.182, lr=1] \nSteps: 26%|██▌ | 390/1500 [04:13<11:56, 1.55it/s, loss=0.182, lr=1]\nSteps: 26%|██▌ | 390/1500 [04:13<11:56, 1.55it/s, loss=0.0418, lr=1]\nSteps: 26%|██▌ | 391/1500 [04:13<11:54, 1.55it/s, loss=0.0418, lr=1]\nSteps: 26%|██▌ | 391/1500 [04:13<11:54, 1.55it/s, loss=0.104, lr=1] \nSteps: 26%|██▌ | 392/1500 [04:14<11:53, 1.55it/s, loss=0.104, lr=1]\nSteps: 26%|██▌ | 392/1500 [04:14<11:53, 1.55it/s, loss=0.0842, lr=1]\nSteps: 26%|██▌ | 393/1500 [04:15<11:52, 1.55it/s, loss=0.0842, lr=1]\nSteps: 26%|██▌ | 393/1500 [04:15<11:52, 1.55it/s, loss=0.0876, 
lr=1]\nSteps: 26%|██▋ | 394/1500 [04:15<11:52, 1.55it/s, loss=0.0876, lr=1]\nSteps: 26%|██▋ | 394/1500 [04:15<11:52, 1.55it/s, loss=0.183, lr=1] \nSteps: 26%|██▋ | 395/1500 [04:16<11:52, 1.55it/s, loss=0.183, lr=1]\nSteps: 26%|██▋ | 395/1500 [04:16<11:52, 1.55it/s, loss=0.11, lr=1] \nSteps: 26%|██▋ | 396/1500 [04:17<11:53, 1.55it/s, loss=0.11, lr=1]\nSteps: 26%|██▋ | 396/1500 [04:17<11:53, 1.55it/s, loss=0.109, lr=1]\nSteps: 26%|██▋ | 397/1500 [04:17<11:52, 1.55it/s, loss=0.109, lr=1]\nSteps: 26%|██▋ | 397/1500 [04:17<11:52, 1.55it/s, loss=0.0939, lr=1]\nSteps: 27%|██▋ | 398/1500 [04:18<11:51, 1.55it/s, loss=0.0939, lr=1]\nSteps: 27%|██▋ | 398/1500 [04:18<11:51, 1.55it/s, loss=0.0673, lr=1]\nSteps: 27%|██▋ | 399/1500 [04:19<11:51, 1.55it/s, loss=0.0673, lr=1]\nSteps: 27%|██▋ | 399/1500 [04:19<11:51, 1.55it/s, loss=0.0303, lr=1]\nSteps: 27%|██▋ | 400/1500 [04:19<11:49, 1.55it/s, loss=0.0303, lr=1]\nSteps: 27%|██▋ | 400/1500 [04:19<11:49, 1.55it/s, loss=0.0659, lr=1]\nSteps: 27%|██▋ | 401/1500 [04:20<11:54, 1.54it/s, loss=0.0659, lr=1]\nSteps: 27%|██▋ | 401/1500 [04:20<11:54, 1.54it/s, loss=0.103, lr=1] \nSteps: 27%|██▋ | 402/1500 [04:21<11:52, 1.54it/s, loss=0.103, lr=1]\nSteps: 27%|██▋ | 402/1500 [04:21<11:52, 1.54it/s, loss=0.0599, lr=1]\nSteps: 27%|██▋ | 403/1500 [04:21<11:52, 1.54it/s, loss=0.0599, lr=1]\nSteps: 27%|██▋ | 403/1500 [04:21<11:52, 1.54it/s, loss=0.16, lr=1] \nSteps: 27%|██▋ | 404/1500 [04:22<11:50, 1.54it/s, loss=0.16, lr=1]\nSteps: 27%|██▋ | 404/1500 [04:22<11:50, 1.54it/s, loss=0.0198, lr=1]\nSteps: 27%|██▋ | 405/1500 [04:23<11:48, 1.55it/s, loss=0.0198, lr=1]\nSteps: 27%|██▋ | 405/1500 [04:23<11:48, 1.55it/s, loss=0.0182, lr=1]\nSteps: 27%|██▋ | 406/1500 [04:23<11:46, 1.55it/s, loss=0.0182, lr=1]\nSteps: 27%|██▋ | 406/1500 [04:23<11:46, 1.55it/s, loss=0.0636, lr=1]\nSteps: 27%|██▋ | 407/1500 [04:24<11:46, 1.55it/s, loss=0.0636, lr=1]\nSteps: 27%|██▋ | 407/1500 [04:24<11:46, 1.55it/s, loss=0.114, lr=1] \nSteps: 27%|██▋ | 408/1500 [04:24<11:45, 
1.55it/s, loss=0.114, lr=1]\nSteps: 27%|██▋ | 408/1500 [04:24<11:45, 1.55it/s, loss=0.0698, lr=1]\nSteps: 27%|██▋ | 409/1500 [04:25<11:44, 1.55it/s, loss=0.0698, lr=1]\nSteps: 27%|██▋ | 409/1500 [04:25<11:44, 1.55it/s, loss=0.153, lr=1] \nSteps: 27%|██▋ | 410/1500 [04:26<11:43, 1.55it/s, loss=0.153, lr=1]\nSteps: 27%|██▋ | 410/1500 [04:26<11:43, 1.55it/s, loss=0.118, lr=1]\nSteps: 27%|██▋ | 411/1500 [04:26<11:42, 1.55it/s, loss=0.118, lr=1]\nSteps: 27%|██▋ | 411/1500 [04:26<11:42, 1.55it/s, loss=0.139, lr=1]\nSteps: 27%|██▋ | 412/1500 [04:27<11:42, 1.55it/s, loss=0.139, lr=1]\nSteps: 27%|██▋ | 412/1500 [04:27<11:42, 1.55it/s, loss=0.143, lr=1]\nSteps: 28%|██▊ | 413/1500 [04:28<11:41, 1.55it/s, loss=0.143, lr=1]\nSteps: 28%|██▊ | 413/1500 [04:28<11:41, 1.55it/s, loss=0.3, lr=1] \nSteps: 28%|██▊ | 414/1500 [04:28<11:40, 1.55it/s, loss=0.3, lr=1]\nSteps: 28%|██▊ | 414/1500 [04:28<11:40, 1.55it/s, loss=0.0971, lr=1]\nSteps: 28%|██▊ | 415/1500 [04:29<11:40, 1.55it/s, loss=0.0971, lr=1]\nSteps: 28%|██▊ | 415/1500 [04:29<11:40, 1.55it/s, loss=0.135, lr=1] \nSteps: 28%|██▊ | 416/1500 [04:30<11:39, 1.55it/s, loss=0.135, lr=1]\nSteps: 28%|██▊ | 416/1500 [04:30<11:39, 1.55it/s, loss=0.0597, lr=1]\nSteps: 28%|██▊ | 417/1500 [04:30<11:42, 1.54it/s, loss=0.0597, lr=1]\nSteps: 28%|██▊ | 417/1500 [04:30<11:42, 1.54it/s, loss=0.113, lr=1] \nSteps: 28%|██▊ | 418/1500 [04:31<11:40, 1.55it/s, loss=0.113, lr=1]\nSteps: 28%|██▊ | 418/1500 [04:31<11:40, 1.55it/s, loss=0.125, lr=1]\nSteps: 28%|██▊ | 419/1500 [04:32<11:38, 1.55it/s, loss=0.125, lr=1]\nSteps: 28%|██▊ | 419/1500 [04:32<11:38, 1.55it/s, loss=0.075, lr=1]\nSteps: 28%|██▊ | 420/1500 [04:32<11:36, 1.55it/s, loss=0.075, lr=1]\nSteps: 28%|██▊ | 420/1500 [04:32<11:36, 1.55it/s, loss=0.159, lr=1]\nSteps: 28%|██▊ | 421/1500 [04:33<11:35, 1.55it/s, loss=0.159, lr=1]\nSteps: 28%|██▊ | 421/1500 [04:33<11:35, 1.55it/s, loss=0.15, lr=1] \nSteps: 28%|██▊ | 422/1500 [04:33<11:34, 1.55it/s, loss=0.15, lr=1]\nSteps: 28%|██▊ | 422/1500 
[04:33<11:34, 1.55it/s, loss=0.099, lr=1]\nSteps: 28%|██▊ | 423/1500 [04:34<11:33, 1.55it/s, loss=0.099, lr=1]\nSteps: 28%|██▊ | 423/1500 [04:34<11:33, 1.55it/s, loss=0.156, lr=1]\nSteps: 28%|██▊ | 424/1500 [04:35<11:33, 1.55it/s, loss=0.156, lr=1]\nSteps: 28%|██▊ | 424/1500 [04:35<11:33, 1.55it/s, loss=0.0727, lr=1]\nSteps: 28%|██▊ | 425/1500 [04:35<11:34, 1.55it/s, loss=0.0727, lr=1]\nSteps: 28%|██▊ | 425/1500 [04:35<11:34, 1.55it/s, loss=0.143, lr=1] \nSteps: 28%|██▊ | 426/1500 [04:36<11:33, 1.55it/s, loss=0.143, lr=1]\nSteps: 28%|██▊ | 426/1500 [04:36<11:33, 1.55it/s, loss=0.113, lr=1]\nSteps: 28%|██▊ | 427/1500 [04:37<11:32, 1.55it/s, loss=0.113, lr=1]\nSteps: 28%|██▊ | 427/1500 [04:37<11:32, 1.55it/s, loss=0.209, lr=1]\nSteps: 29%|██▊ | 428/1500 [04:37<11:31, 1.55it/s, loss=0.209, lr=1]\nSteps: 29%|██▊ | 428/1500 [04:37<11:31, 1.55it/s, loss=0.438, lr=1]\nSteps: 29%|██▊ | 429/1500 [04:38<11:30, 1.55it/s, loss=0.438, lr=1]\nSteps: 29%|██▊ | 429/1500 [04:38<11:30, 1.55it/s, loss=0.169, lr=1]\nSteps: 29%|██▊ | 430/1500 [04:39<11:29, 1.55it/s, loss=0.169, lr=1]\nSteps: 29%|██▊ | 430/1500 [04:39<11:29, 1.55it/s, loss=0.0719, lr=1]\nSteps: 29%|██▊ | 431/1500 [04:39<11:28, 1.55it/s, loss=0.0719, lr=1]\nSteps: 29%|██▊ | 431/1500 [04:39<11:28, 1.55it/s, loss=0.166, lr=1] \nSteps: 29%|██▉ | 432/1500 [04:40<11:27, 1.55it/s, loss=0.166, lr=1]\nSteps: 29%|██▉ | 432/1500 [04:40<11:27, 1.55it/s, loss=0.184, lr=1]\nSteps: 29%|██▉ | 433/1500 [04:41<11:31, 1.54it/s, loss=0.184, lr=1]\nSteps: 29%|██▉ | 433/1500 [04:41<11:31, 1.54it/s, loss=0.16, lr=1] \nSteps: 29%|██▉ | 434/1500 [04:41<11:29, 1.55it/s, loss=0.16, lr=1]\nSteps: 29%|██▉ | 434/1500 [04:41<11:29, 1.55it/s, loss=0.348, lr=1]\nSteps: 29%|██▉ | 435/1500 [04:42<11:28, 1.55it/s, loss=0.348, lr=1]\nSteps: 29%|██▉ | 435/1500 [04:42<11:28, 1.55it/s, loss=0.141, lr=1]\nSteps: 29%|██▉ | 436/1500 [04:43<11:26, 1.55it/s, loss=0.141, lr=1]\nSteps: 29%|██▉ | 436/1500 [04:43<11:26, 1.55it/s, loss=0.101, lr=1]\nSteps: 29%|██▉ | 
437/1500 [04:43<11:25, 1.55it/s, loss=0.101, lr=1]\nSteps: 29%|██▉ | 437/1500 [04:43<11:25, 1.55it/s, loss=0.34, lr=1] \nSteps: 29%|██▉ | 438/1500 [04:44<11:26, 1.55it/s, loss=0.34, lr=1]\nSteps: 29%|██▉ | 438/1500 [04:44<11:26, 1.55it/s, loss=0.202, lr=1]\nSteps: 29%|██▉ | 439/1500 [04:44<11:24, 1.55it/s, loss=0.202, lr=1]\nSteps: 29%|██▉ | 439/1500 [04:44<11:24, 1.55it/s, loss=0.178, lr=1]\nSteps: 29%|██▉ | 440/1500 [04:45<11:22, 1.55it/s, loss=0.178, lr=1]\nSteps: 29%|██▉ | 440/1500 [04:45<11:22, 1.55it/s, loss=0.0855, lr=1]\nSteps: 29%|██▉ | 441/1500 [04:46<11:22, 1.55it/s, loss=0.0855, lr=1]\nSteps: 29%|██▉ | 441/1500 [04:46<11:22, 1.55it/s, loss=0.123, lr=1] \nSteps: 29%|██▉ | 442/1500 [04:46<11:22, 1.55it/s, loss=0.123, lr=1]\nSteps: 29%|██▉ | 442/1500 [04:46<11:22, 1.55it/s, loss=0.0254, lr=1]\nSteps: 30%|██▉ | 443/1500 [04:47<11:22, 1.55it/s, loss=0.0254, lr=1]\nSteps: 30%|██▉ | 443/1500 [04:47<11:22, 1.55it/s, loss=0.168, lr=1] \nSteps: 30%|██▉ | 444/1500 [04:48<11:21, 1.55it/s, loss=0.168, lr=1]\nSteps: 30%|██▉ | 444/1500 [04:48<11:21, 1.55it/s, loss=0.0671, lr=1]\nSteps: 30%|██▉ | 445/1500 [04:48<11:20, 1.55it/s, loss=0.0671, lr=1]\nSteps: 30%|██▉ | 445/1500 [04:48<11:20, 1.55it/s, loss=0.112, lr=1] \nSteps: 30%|██▉ | 446/1500 [04:49<11:19, 1.55it/s, loss=0.112, lr=1]\nSteps: 30%|██▉ | 446/1500 [04:49<11:19, 1.55it/s, loss=0.246, lr=1]\nSteps: 30%|██▉ | 447/1500 [04:50<11:18, 1.55it/s, loss=0.246, lr=1]\nSteps: 30%|██▉ | 447/1500 [04:50<11:18, 1.55it/s, loss=0.496, lr=1]\nSteps: 30%|██▉ | 448/1500 [04:50<11:17, 1.55it/s, loss=0.496, lr=1]\nSteps: 30%|██▉ | 448/1500 [04:50<11:17, 1.55it/s, loss=0.137, lr=1]\nSteps: 30%|██▉ | 449/1500 [04:51<11:21, 1.54it/s, loss=0.137, lr=1]\nSteps: 30%|██▉ | 449/1500 [04:51<11:21, 1.54it/s, loss=0.157, lr=1]\nSteps: 30%|███ | 450/1500 [04:52<11:19, 1.55it/s, loss=0.157, lr=1]\nSteps: 30%|███ | 450/1500 [04:52<11:19, 1.55it/s, loss=0.0549, lr=1]\nSteps: 30%|███ | 451/1500 [04:52<11:17, 1.55it/s, loss=0.0549, 
lr=1]\nSteps: 30%|███ | 451/1500 [04:52<11:17, 1.55it/s, loss=0.106, lr=1] \nSteps: 30%|███ | 452/1500 [04:53<11:16, 1.55it/s, loss=0.106, lr=1]\nSteps: 30%|███ | 452/1500 [04:53<11:16, 1.55it/s, loss=0.134, lr=1]\nSteps: 30%|███ | 453/1500 [04:53<11:14, 1.55it/s, loss=0.134, lr=1]\nSteps: 30%|███ | 453/1500 [04:53<11:14, 1.55it/s, loss=0.11, lr=1] \nSteps: 30%|███ | 454/1500 [04:54<11:13, 1.55it/s, loss=0.11, lr=1]\nSteps: 30%|███ | 454/1500 [04:54<11:13, 1.55it/s, loss=0.135, lr=1]\nSteps: 30%|███ | 455/1500 [04:55<11:12, 1.55it/s, loss=0.135, lr=1]\nSteps: 30%|███ | 455/1500 [04:55<11:12, 1.55it/s, loss=0.218, lr=1]\nSteps: 30%|███ | 456/1500 [04:55<11:12, 1.55it/s, loss=0.218, lr=1]\nSteps: 30%|███ | 456/1500 [04:55<11:12, 1.55it/s, loss=0.127, lr=1]\nSteps: 30%|███ | 457/1500 [04:56<11:12, 1.55it/s, loss=0.127, lr=1]\nSteps: 30%|███ | 457/1500 [04:56<11:12, 1.55it/s, loss=0.118, lr=1]\nSteps: 31%|███ | 458/1500 [04:57<11:12, 1.55it/s, loss=0.118, lr=1]\nSteps: 31%|███ | 458/1500 [04:57<11:12, 1.55it/s, loss=0.0756, lr=1]\nSteps: 31%|███ | 459/1500 [04:57<11:11, 1.55it/s, loss=0.0756, lr=1]\nSteps: 31%|███ | 459/1500 [04:57<11:11, 1.55it/s, loss=0.135, lr=1] \nSteps: 31%|███ | 460/1500 [04:58<11:10, 1.55it/s, loss=0.135, lr=1]\nSteps: 31%|███ | 460/1500 [04:58<11:10, 1.55it/s, loss=0.093, lr=1]\nSteps: 31%|███ | 461/1500 [04:59<11:09, 1.55it/s, loss=0.093, lr=1]\nSteps: 31%|███ | 461/1500 [04:59<11:09, 1.55it/s, loss=0.128, lr=1]\nSteps: 31%|███ | 462/1500 [04:59<11:09, 1.55it/s, loss=0.128, lr=1]\nSteps: 31%|███ | 462/1500 [04:59<11:09, 1.55it/s, loss=0.0816, lr=1]\nSteps: 31%|███ | 463/1500 [05:00<11:08, 1.55it/s, loss=0.0816, lr=1]\nSteps: 31%|███ | 463/1500 [05:00<11:08, 1.55it/s, loss=0.118, lr=1] \nSteps: 31%|███ | 464/1500 [05:01<11:07, 1.55it/s, loss=0.118, lr=1]\nSteps: 31%|███ | 464/1500 [05:01<11:07, 1.55it/s, loss=0.0988, lr=1]\nSteps: 31%|███ | 465/1500 [05:01<11:10, 1.54it/s, loss=0.0988, lr=1]\nSteps: 31%|███ | 465/1500 [05:01<11:10, 1.54it/s, 
loss=0.088, lr=1] \nSteps: 31%|███ | 466/1500 [05:02<11:08, 1.55it/s, loss=0.088, lr=1]\nSteps: 31%|███ | 466/1500 [05:02<11:08, 1.55it/s, loss=0.0776, lr=1]\nSteps: 31%|███ | 467/1500 [05:03<11:07, 1.55it/s, loss=0.0776, lr=1]\nSteps: 31%|███ | 467/1500 [05:03<11:07, 1.55it/s, loss=0.0928, lr=1]\nSteps: 31%|███ | 468/1500 [05:03<11:05, 1.55it/s, loss=0.0928, lr=1]\nSteps: 31%|███ | 468/1500 [05:03<11:05, 1.55it/s, loss=0.0588, lr=1]\nSteps: 31%|███▏ | 469/1500 [05:04<11:05, 1.55it/s, loss=0.0588, lr=1]\nSteps: 31%|███▏ | 469/1500 [05:04<11:05, 1.55it/s, loss=0.172, lr=1] \nSteps: 31%|███▏ | 470/1500 [05:04<11:04, 1.55it/s, loss=0.172, lr=1]\nSteps: 31%|███▏ | 470/1500 [05:04<11:04, 1.55it/s, loss=0.0969, lr=1]\nSteps: 31%|███▏ | 471/1500 [05:05<11:02, 1.55it/s, loss=0.0969, lr=1]\nSteps: 31%|███▏ | 471/1500 [05:05<11:02, 1.55it/s, loss=0.106, lr=1] \nSteps: 31%|███▏ | 472/1500 [05:06<11:02, 1.55it/s, loss=0.106, lr=1]\nSteps: 31%|███▏ | 472/1500 [05:06<11:02, 1.55it/s, loss=0.172, lr=1]\nSteps: 32%|███▏ | 473/1500 [05:06<11:01, 1.55it/s, loss=0.172, lr=1]\nSteps: 32%|███▏ | 473/1500 [05:06<11:01, 1.55it/s, loss=0.074, lr=1]\nSteps: 32%|███▏ | 474/1500 [05:07<11:01, 1.55it/s, loss=0.074, lr=1]\nSteps: 32%|███▏ | 474/1500 [05:07<11:01, 1.55it/s, loss=0.255, lr=1]\nSteps: 32%|███▏ | 475/1500 [05:08<11:00, 1.55it/s, loss=0.255, lr=1]\nSteps: 32%|███▏ | 475/1500 [05:08<11:00, 1.55it/s, loss=0.166, lr=1]\nSteps: 32%|███▏ | 476/1500 [05:08<11:00, 1.55it/s, loss=0.166, lr=1]\nSteps: 32%|███▏ | 476/1500 [05:08<11:00, 1.55it/s, loss=0.0925, lr=1]\nSteps: 32%|███▏ | 477/1500 [05:09<10:59, 1.55it/s, loss=0.0925, lr=1]\nSteps: 32%|███▏ | 477/1500 [05:09<10:59, 1.55it/s, loss=0.0624, lr=1]\nSteps: 32%|███▏ | 478/1500 [05:10<10:58, 1.55it/s, loss=0.0624, lr=1]\nSteps: 32%|███▏ | 478/1500 [05:10<10:58, 1.55it/s, loss=0.134, lr=1] \nSteps: 32%|███▏ | 479/1500 [05:10<10:57, 1.55it/s, loss=0.134, lr=1]\nSteps: 32%|███▏ | 479/1500 [05:10<10:57, 1.55it/s, loss=0.244, lr=1]\nSteps: 
32%|███▏ | 480/1500 [05:11<10:57, 1.55it/s, loss=0.244, lr=1]\nSteps: 32%|███▏ | 480/1500 [05:11<10:57, 1.55it/s, loss=0.18, lr=1] \nSteps: 32%|███▏ | 481/1500 [05:12<11:00, 1.54it/s, loss=0.18, lr=1]\nSteps: 32%|███▏ | 481/1500 [05:12<11:00, 1.54it/s, loss=0.0471, lr=1]\nSteps: 32%|███▏ | 482/1500 [05:12<10:58, 1.55it/s, loss=0.0471, lr=1]\nSteps: 32%|███▏ | 482/1500 [05:12<10:58, 1.55it/s, loss=0.295, lr=1] \nSteps: 32%|███▏ | 483/1500 [05:13<10:57, 1.55it/s, loss=0.295, lr=1]\nSteps: 32%|███▏ | 483/1500 [05:13<10:57, 1.55it/s, loss=0.28, lr=1] \nSteps: 32%|███▏ | 484/1500 [05:14<11:43, 1.44it/s, loss=0.28, lr=1]\nSteps: 32%|███▏ | 484/1500 [05:14<11:43, 1.44it/s, loss=0.117, lr=1]\nSteps: 32%|███▏ | 485/1500 [05:14<11:28, 1.47it/s, loss=0.117, lr=1]\nSteps: 32%|███▏ | 485/1500 [05:14<11:28, 1.47it/s, loss=0.183, lr=1]\nSteps: 32%|███▏ | 486/1500 [05:15<11:17, 1.50it/s, loss=0.183, lr=1]\nSteps: 32%|███▏ | 486/1500 [05:15<11:17, 1.50it/s, loss=0.0755, lr=1]\nSteps: 32%|███▏ | 487/1500 [05:16<11:09, 1.51it/s, loss=0.0755, lr=1]\nSteps: 32%|███▏ | 487/1500 [05:16<11:09, 1.51it/s, loss=0.112, lr=1] \nSteps: 33%|███▎ | 488/1500 [05:16<11:03, 1.53it/s, loss=0.112, lr=1]\nSteps: 33%|███▎ | 488/1500 [05:16<11:03, 1.53it/s, loss=0.191, lr=1]\nSteps: 33%|███▎ | 489/1500 [05:17<10:59, 1.53it/s, loss=0.191, lr=1]\nSteps: 33%|███▎ | 489/1500 [05:17<10:59, 1.53it/s, loss=0.172, lr=1]\nSteps: 33%|███▎ | 490/1500 [05:18<10:56, 1.54it/s, loss=0.172, lr=1]\nSteps: 33%|███▎ | 490/1500 [05:18<10:56, 1.54it/s, loss=0.108, lr=1]\nSteps: 33%|███▎ | 491/1500 [05:18<10:53, 1.54it/s, loss=0.108, lr=1]\nSteps: 33%|███▎ | 491/1500 [05:18<10:53, 1.54it/s, loss=0.118, lr=1]\nSteps: 33%|███▎ | 492/1500 [05:19<10:52, 1.55it/s, loss=0.118, lr=1]\nSteps: 33%|███▎ | 492/1500 [05:19<10:52, 1.55it/s, loss=0.124, lr=1]\nSteps: 33%|███▎ | 493/1500 [05:19<10:50, 1.55it/s, loss=0.124, lr=1]\nSteps: 33%|███▎ | 493/1500 [05:19<10:50, 1.55it/s, loss=0.0977, lr=1]\nSteps: 33%|███▎ | 494/1500 [05:20<10:49, 
1.55it/s, loss=0.0977, lr=1]\nSteps: 33%|███▎ | 494/1500 [05:20<10:49, 1.55it/s, loss=0.149, lr=1] \nSteps: 33%|███▎ | 495/1500 [05:21<10:49, 1.55it/s, loss=0.149, lr=1]\nSteps: 33%|███▎ | 495/1500 [05:21<10:49, 1.55it/s, loss=0.127, lr=1]\nSteps: 33%|███▎ | 496/1500 [05:21<10:48, 1.55it/s, loss=0.127, lr=1]\nSteps: 33%|███▎ | 496/1500 [05:21<10:48, 1.55it/s, loss=0.0968, lr=1]\nSteps: 33%|███▎ | 497/1500 [05:22<10:51, 1.54it/s, loss=0.0968, lr=1]\nSteps: 33%|███▎ | 497/1500 [05:22<10:51, 1.54it/s, loss=0.246, lr=1] \nSteps: 33%|███▎ | 498/1500 [05:23<10:49, 1.54it/s, loss=0.246, lr=1]\nSteps: 33%|███▎ | 498/1500 [05:23<10:49, 1.54it/s, loss=0.211, lr=1]\nSteps: 33%|███▎ | 499/1500 [05:23<10:47, 1.54it/s, loss=0.211, lr=1]\nSteps: 33%|███▎ | 499/1500 [05:23<10:47, 1.54it/s, loss=0.0595, lr=1]\nSteps: 33%|███▎ | 500/1500 [05:24<10:47, 1.54it/s, loss=0.0595, lr=1]\nSteps: 33%|███▎ | 500/1500 [05:24<10:47, 1.54it/s, loss=0.217, lr=1] \nSteps: 33%|███▎ | 501/1500 [05:25<10:46, 1.55it/s, loss=0.217, lr=1]\nSteps: 33%|███▎ | 501/1500 [05:25<10:46, 1.55it/s, loss=0.234, lr=1]\nSteps: 33%|███▎ | 502/1500 [05:25<10:45, 1.55it/s, loss=0.234, lr=1]\nSteps: 33%|███▎ | 502/1500 [05:25<10:45, 1.55it/s, loss=0.189, lr=1]\nSteps: 34%|███▎ | 503/1500 [05:26<10:45, 1.55it/s, loss=0.189, lr=1]\nSteps: 34%|███▎ | 503/1500 [05:26<10:45, 1.55it/s, loss=0.0568, lr=1]\nSteps: 34%|███▎ | 504/1500 [05:27<10:43, 1.55it/s, loss=0.0568, lr=1]\nSteps: 34%|███▎ | 504/1500 [05:27<10:43, 1.55it/s, loss=0.124, lr=1] \nSteps: 34%|███▎ | 505/1500 [05:27<10:42, 1.55it/s, loss=0.124, lr=1]\nSteps: 34%|███▎ | 505/1500 [05:27<10:42, 1.55it/s, loss=0.144, lr=1]\nSteps: 34%|███▎ | 506/1500 [05:28<10:41, 1.55it/s, loss=0.144, lr=1]\nSteps: 34%|███▎ | 506/1500 [05:28<10:41, 1.55it/s, loss=0.0346, lr=1]\nSteps: 34%|███▍ | 507/1500 [05:28<10:41, 1.55it/s, loss=0.0346, lr=1]\nSteps: 34%|███▍ | 507/1500 [05:28<10:41, 1.55it/s, loss=0.0517, lr=1]\nSteps: 34%|███▍ | 508/1500 [05:29<10:40, 1.55it/s, loss=0.0517, 
lr=1]\nSteps: 34%|███▍ | 508/1500 [05:29<10:40, 1.55it/s, loss=0.119, lr=1] \nSteps: 34%|███▍ | 509/1500 [05:30<10:39, 1.55it/s, loss=0.119, lr=1]\nSteps: 34%|███▍ | 509/1500 [05:30<10:39, 1.55it/s, loss=0.095, lr=1]\nSteps: 34%|███▍ | 510/1500 [05:30<10:38, 1.55it/s, loss=0.095, lr=1]\nSteps: 34%|███▍ | 510/1500 [05:30<10:38, 1.55it/s, loss=0.18, lr=1] \nSteps: 34%|███▍ | 511/1500 [05:31<10:38, 1.55it/s, loss=0.18, lr=1]\nSteps: 34%|███▍ | 511/1500 [05:31<10:38, 1.55it/s, loss=0.247, lr=1]\nSteps: 34%|███▍ | 512/1500 [05:32<10:37, 1.55it/s, loss=0.247, lr=1]\nSteps: 34%|███▍ | 512/1500 [05:32<10:37, 1.55it/s, loss=0.0814, lr=1]\nSteps: 34%|███▍ | 513/1500 [05:32<10:40, 1.54it/s, loss=0.0814, lr=1]\nSteps: 34%|███▍ | 513/1500 [05:32<10:40, 1.54it/s, loss=0.135, lr=1] \nSteps: 34%|███▍ | 514/1500 [05:33<10:39, 1.54it/s, loss=0.135, lr=1]\nSteps: 34%|███▍ | 514/1500 [05:33<10:39, 1.54it/s, loss=0.201, lr=1]\nSteps: 34%|███▍ | 515/1500 [05:34<10:40, 1.54it/s, loss=0.201, lr=1]\nSteps: 34%|███▍ | 515/1500 [05:34<10:40, 1.54it/s, loss=0.198, lr=1]\nSteps: 34%|███▍ | 516/1500 [05:34<10:43, 1.53it/s, loss=0.198, lr=1]\nSteps: 34%|███▍ | 516/1500 [05:34<10:43, 1.53it/s, loss=0.0714, lr=1]\nSteps: 34%|███▍ | 517/1500 [05:35<10:40, 1.54it/s, loss=0.0714, lr=1]\nSteps: 34%|███▍ | 517/1500 [05:35<10:40, 1.54it/s, loss=0.112, lr=1] \nSteps: 35%|███▍ | 518/1500 [05:36<10:37, 1.54it/s, loss=0.112, lr=1]\nSteps: 35%|███▍ | 518/1500 [05:36<10:37, 1.54it/s, loss=0.0658, lr=1]\nSteps: 35%|███▍ | 519/1500 [05:36<10:35, 1.54it/s, loss=0.0658, lr=1]\nSteps: 35%|███▍ | 519/1500 [05:36<10:35, 1.54it/s, loss=0.157, lr=1] \nSteps: 35%|███▍ | 520/1500 [05:37<10:34, 1.54it/s, loss=0.157, lr=1]\nSteps: 35%|███▍ | 520/1500 [05:37<10:34, 1.54it/s, loss=0.0401, lr=1]\nSteps: 35%|███▍ | 521/1500 [05:38<10:33, 1.54it/s, loss=0.0401, lr=1]\nSteps: 35%|███▍ | 521/1500 [05:38<10:33, 1.54it/s, loss=0.0419, lr=1]\nSteps: 35%|███▍ | 522/1500 [05:38<10:31, 1.55it/s, loss=0.0419, lr=1]\nSteps: 35%|███▍ | 
522/1500 [05:38<10:31, 1.55it/s, loss=0.129, lr=1] \nSteps: 35%|███▍ | 523/1500 [05:39<10:30, 1.55it/s, loss=0.129, lr=1]\nSteps: 35%|███▍ | 523/1500 [05:39<10:30, 1.55it/s, loss=0.109, lr=1]\nSteps: 35%|███▍ | 524/1500 [05:40<10:29, 1.55it/s, loss=0.109, lr=1]\nSteps: 35%|███▍ | 524/1500 [05:40<10:29, 1.55it/s, loss=0.0802, lr=1]\nSteps: 35%|███▌ | 525/1500 [05:40<10:28, 1.55it/s, loss=0.0802, lr=1]\nSteps: 35%|███▌ | 525/1500 [05:40<10:28, 1.55it/s, loss=0.156, lr=1] \nSteps: 35%|███▌ | 526/1500 [05:41<10:27, 1.55it/s, loss=0.156, lr=1]\nSteps: 35%|███▌ | 526/1500 [05:41<10:27, 1.55it/s, loss=0.115, lr=1]\nSteps: 35%|███▌ | 527/1500 [05:41<10:26, 1.55it/s, loss=0.115, lr=1]\nSteps: 35%|███▌ | 527/1500 [05:41<10:26, 1.55it/s, loss=0.279, lr=1]\nSteps: 35%|███▌ | 528/1500 [05:42<10:25, 1.55it/s, loss=0.279, lr=1]\nSteps: 35%|███▌ | 528/1500 [05:42<10:25, 1.55it/s, loss=0.237, lr=1]\nSteps: 35%|███▌ | 529/1500 [05:43<10:29, 1.54it/s, loss=0.237, lr=1]\nSteps: 35%|███▌ | 529/1500 [05:43<10:29, 1.54it/s, loss=0.057, lr=1]\nSteps: 35%|███▌ | 530/1500 [05:43<10:27, 1.55it/s, loss=0.057, lr=1]\nSteps: 35%|███▌ | 530/1500 [05:43<10:27, 1.55it/s, loss=0.179, lr=1]\nSteps: 35%|███▌ | 531/1500 [05:44<10:25, 1.55it/s, loss=0.179, lr=1]\nSteps: 35%|███▌ | 531/1500 [05:44<10:25, 1.55it/s, loss=0.139, lr=1]\nSteps: 35%|███▌ | 532/1500 [05:45<10:24, 1.55it/s, loss=0.139, lr=1]\nSteps: 35%|███▌ | 532/1500 [05:45<10:24, 1.55it/s, loss=0.126, lr=1]\nSteps: 36%|███▌ | 533/1500 [05:45<10:23, 1.55it/s, loss=0.126, lr=1]\nSteps: 36%|███▌ | 533/1500 [05:45<10:23, 1.55it/s, loss=0.0814, lr=1]\nSteps: 36%|███▌ | 534/1500 [05:46<10:22, 1.55it/s, loss=0.0814, lr=1]\nSteps: 36%|███▌ | 534/1500 [05:46<10:22, 1.55it/s, loss=0.191, lr=1] \nSteps: 36%|███▌ | 535/1500 [05:47<10:22, 1.55it/s, loss=0.191, lr=1]\nSteps: 36%|███▌ | 535/1500 [05:47<10:22, 1.55it/s, loss=0.171, lr=1]\nSteps: 36%|███▌ | 536/1500 [05:47<10:21, 1.55it/s, loss=0.171, lr=1]\nSteps: 36%|███▌ | 536/1500 [05:47<10:21, 1.55it/s, 
loss=0.129, lr=1]\nSteps: 36%|███▌ | 537/1500 [05:48<10:20, 1.55it/s, loss=0.129, lr=1]\nSteps: 36%|███▌ | 537/1500 [05:48<10:20, 1.55it/s, loss=0.067, lr=1]\nSteps: 36%|███▌ | 538/1500 [05:49<10:19, 1.55it/s, loss=0.067, lr=1]\nSteps: 36%|███▌ | 538/1500 [05:49<10:19, 1.55it/s, loss=0.279, lr=1]\nSteps: 36%|███▌ | 539/1500 [05:49<10:18, 1.55it/s, loss=0.279, lr=1]\nSteps: 36%|███▌ | 539/1500 [05:49<10:18, 1.55it/s, loss=0.277, lr=1]\nSteps: 36%|███▌ | 540/1500 [05:50<10:17, 1.55it/s, loss=0.277, lr=1]\nSteps: 36%|███▌ | 540/1500 [05:50<10:17, 1.55it/s, loss=0.257, lr=1]\nSteps: 36%|███▌ | 541/1500 [05:50<10:16, 1.55it/s, loss=0.257, lr=1]\nSteps: 36%|███▌ | 541/1500 [05:50<10:16, 1.55it/s, loss=0.348, lr=1]\nSteps: 36%|███▌ | 542/1500 [05:51<10:15, 1.56it/s, loss=0.348, lr=1]\nSteps: 36%|███▌ | 542/1500 [05:51<10:15, 1.56it/s, loss=0.061, lr=1]\nSteps: 36%|███▌ | 543/1500 [05:52<10:15, 1.55it/s, loss=0.061, lr=1]\nSteps: 36%|███▌ | 543/1500 [05:52<10:15, 1.55it/s, loss=0.0925, lr=1]\nSteps: 36%|███▋ | 544/1500 [05:52<10:15, 1.55it/s, loss=0.0925, lr=1]\nSteps: 36%|███▋ | 544/1500 [05:52<10:15, 1.55it/s, loss=0.206, lr=1] \nSteps: 36%|███▋ | 545/1500 [05:53<10:17, 1.55it/s, loss=0.206, lr=1]\nSteps: 36%|███▋ | 545/1500 [05:53<10:17, 1.55it/s, loss=0.0522, lr=1]\nSteps: 36%|███▋ | 546/1500 [05:54<10:16, 1.55it/s, loss=0.0522, lr=1]\nSteps: 36%|███▋ | 546/1500 [05:54<10:16, 1.55it/s, loss=0.0838, lr=1]\nSteps: 36%|███▋ | 547/1500 [05:54<10:17, 1.54it/s, loss=0.0838, lr=1]\nSteps: 36%|███▋ | 547/1500 [05:54<10:17, 1.54it/s, loss=0.0663, lr=1]\nSteps: 37%|███▋ | 548/1500 [05:55<10:15, 1.55it/s, loss=0.0663, lr=1]\nSteps: 37%|███▋ | 548/1500 [05:55<10:15, 1.55it/s, loss=0.157, lr=1] \nSteps: 37%|███▋ | 549/1500 [05:56<10:15, 1.55it/s, loss=0.157, lr=1]\nSteps: 37%|███▋ | 549/1500 [05:56<10:15, 1.55it/s, loss=0.0236, lr=1]\nSteps: 37%|███▋ | 550/1500 [05:56<10:15, 1.54it/s, loss=0.0236, lr=1]\nSteps: 37%|███▋ | 550/1500 [05:56<10:15, 1.54it/s, loss=0.087, lr=1] \nSteps: 
37%|███▋ | 551/1500 [05:57<10:14, 1.55it/s, loss=0.087, lr=1]\nSteps: 37%|███▋ | 551/1500 [05:57<10:14, 1.55it/s, loss=0.0453, lr=1]\nSteps: 37%|███▋ | 552/1500 [05:58<10:12, 1.55it/s, loss=0.0453, lr=1]\nSteps: 37%|███▋ | 552/1500 [05:58<10:12, 1.55it/s, loss=0.146, lr=1] \nSteps: 37%|███▋ | 553/1500 [05:58<10:11, 1.55it/s, loss=0.146, lr=1]\nSteps: 37%|███▋ | 553/1500 [05:58<10:11, 1.55it/s, loss=0.154, lr=1]\nSteps: 37%|███▋ | 554/1500 [05:59<10:10, 1.55it/s, loss=0.154, lr=1]\nSteps: 37%|███▋ | 554/1500 [05:59<10:10, 1.55it/s, loss=0.173, lr=1]\nSteps: 37%|███▋ | 555/1500 [06:00<10:09, 1.55it/s, loss=0.173, lr=1]\nSteps: 37%|███▋ | 555/1500 [06:00<10:09, 1.55it/s, loss=0.0916, lr=1]\nSteps: 37%|███▋ | 556/1500 [06:00<10:08, 1.55it/s, loss=0.0916, lr=1]\nSteps: 37%|███▋ | 556/1500 [06:00<10:08, 1.55it/s, loss=0.023, lr=1] \nSteps: 37%|███▋ | 557/1500 [06:01<10:08, 1.55it/s, loss=0.023, lr=1]\nSteps: 37%|███▋ | 557/1500 [06:01<10:08, 1.55it/s, loss=0.14, lr=1] \nSteps: 37%|███▋ | 558/1500 [06:01<10:07, 1.55it/s, loss=0.14, lr=1]\nSteps: 37%|███▋ | 558/1500 [06:01<10:07, 1.55it/s, loss=0.125, lr=1]\nSteps: 37%|███▋ | 559/1500 [06:02<10:06, 1.55it/s, loss=0.125, lr=1]\nSteps: 37%|███▋ | 559/1500 [06:02<10:06, 1.55it/s, loss=0.146, lr=1]\nSteps: 37%|███▋ | 560/1500 [06:03<10:05, 1.55it/s, loss=0.146, lr=1]\nSteps: 37%|███▋ | 560/1500 [06:03<10:05, 1.55it/s, loss=0.0552, lr=1]\nSteps: 37%|███▋ | 561/1500 [06:03<10:09, 1.54it/s, loss=0.0552, lr=1]\nSteps: 37%|███▋ | 561/1500 [06:03<10:09, 1.54it/s, loss=0.207, lr=1] \nSteps: 37%|███▋ | 562/1500 [06:04<10:07, 1.54it/s, loss=0.207, lr=1]\nSteps: 37%|███▋ | 562/1500 [06:04<10:07, 1.54it/s, loss=0.189, lr=1]\nSteps: 38%|███▊ | 563/1500 [06:05<10:05, 1.55it/s, loss=0.189, lr=1]\nSteps: 38%|███▊ | 563/1500 [06:05<10:05, 1.55it/s, loss=0.168, lr=1]\nSteps: 38%|███▊ | 564/1500 [06:05<10:04, 1.55it/s, loss=0.168, lr=1]\nSteps: 38%|███▊ | 564/1500 [06:05<10:04, 1.55it/s, loss=0.0482, lr=1]\nSteps: 38%|███▊ | 565/1500 
[06:06<10:03, 1.55it/s, loss=0.0482, lr=1]\nSteps: 38%|███▊ | 565/1500 [06:06<10:03, 1.55it/s, loss=0.113, lr=1] \nSteps: 38%|███▊ | 566/1500 [06:07<10:02, 1.55it/s, loss=0.113, lr=1]\nSteps: 38%|███▊ | 566/1500 [06:07<10:02, 1.55it/s, loss=0.108, lr=1]\nSteps: 38%|███▊ | 567/1500 [06:07<10:01, 1.55it/s, loss=0.108, lr=1]\nSteps: 38%|███▊ | 567/1500 [06:07<10:01, 1.55it/s, loss=0.137, lr=1]\nSteps: 38%|███▊ | 568/1500 [06:08<10:00, 1.55it/s, loss=0.137, lr=1]\nSteps: 38%|███▊ | 568/1500 [06:08<10:00, 1.55it/s, loss=0.167, lr=1]\nSteps: 38%|███▊ | 569/1500 [06:09<09:59, 1.55it/s, loss=0.167, lr=1]\nSteps: 38%|███▊ | 569/1500 [06:09<09:59, 1.55it/s, loss=0.195, lr=1]\nSteps: 38%|███▊ | 570/1500 [06:09<09:59, 1.55it/s, loss=0.195, lr=1]\nSteps: 38%|███▊ | 570/1500 [06:09<09:59, 1.55it/s, loss=0.104, lr=1]\nSteps: 38%|███▊ | 571/1500 [06:10<09:58, 1.55it/s, loss=0.104, lr=1]\nSteps: 38%|███▊ | 571/1500 [06:10<09:58, 1.55it/s, loss=0.129, lr=1]\nSteps: 38%|███▊ | 572/1500 [06:10<09:58, 1.55it/s, loss=0.129, lr=1]\nSteps: 38%|███▊ | 572/1500 [06:10<09:58, 1.55it/s, loss=0.0649, lr=1]\nSteps: 38%|███▊ | 573/1500 [06:11<09:57, 1.55it/s, loss=0.0649, lr=1]\nSteps: 38%|███▊ | 573/1500 [06:11<09:57, 1.55it/s, loss=0.156, lr=1] \nSteps: 38%|███▊ | 574/1500 [06:12<09:57, 1.55it/s, loss=0.156, lr=1]\nSteps: 38%|███▊ | 574/1500 [06:12<09:57, 1.55it/s, loss=0.0962, lr=1]\nSteps: 38%|███▊ | 575/1500 [06:12<09:56, 1.55it/s, loss=0.0962, lr=1]\nSteps: 38%|███▊ | 575/1500 [06:12<09:56, 1.55it/s, loss=0.139, lr=1] \nSteps: 38%|███▊ | 576/1500 [06:13<09:55, 1.55it/s, loss=0.139, lr=1]\nSteps: 38%|███▊ | 576/1500 [06:13<09:55, 1.55it/s, loss=0.232, lr=1]\nSteps: 38%|███▊ | 577/1500 [06:14<09:58, 1.54it/s, loss=0.232, lr=1]\nSteps: 38%|███▊ | 577/1500 [06:14<09:58, 1.54it/s, loss=0.174, lr=1]\nSteps: 39%|███▊ | 578/1500 [06:14<09:58, 1.54it/s, loss=0.174, lr=1]\nSteps: 39%|███▊ | 578/1500 [06:14<09:58, 1.54it/s, loss=0.0637, lr=1]\nSteps: 39%|███▊ | 579/1500 [06:15<09:56, 1.54it/s, 
loss=0.0637, lr=1]\nSteps: 39%|███▊ | 579/1500 [06:15<09:56, 1.54it/s, loss=0.11, lr=1] \nSteps: 39%|███▊ | 580/1500 [06:16<09:54, 1.55it/s, loss=0.11, lr=1]\nSteps: 39%|███▊ | 580/1500 [06:16<09:54, 1.55it/s, loss=0.152, lr=1]\nSteps: 39%|███▊ | 581/1500 [06:16<09:53, 1.55it/s, loss=0.152, lr=1]\nSteps: 39%|███▊ | 581/1500 [06:16<09:53, 1.55it/s, loss=0.166, lr=1]\nSteps: 39%|███▉ | 582/1500 [06:17<09:52, 1.55it/s, loss=0.166, lr=1]\nSteps: 39%|███▉ | 582/1500 [06:17<09:52, 1.55it/s, loss=0.0905, lr=1]\nSteps: 39%|███▉ | 583/1500 [06:18<09:52, 1.55it/s, loss=0.0905, lr=1]\nSteps: 39%|███▉ | 583/1500 [06:18<09:52, 1.55it/s, loss=0.112, lr=1] \nSteps: 39%|███▉ | 584/1500 [06:18<09:51, 1.55it/s, loss=0.112, lr=1]\nSteps: 39%|███▉ | 584/1500 [06:18<09:51, 1.55it/s, loss=0.22, lr=1] \nSteps: 39%|███▉ | 585/1500 [06:19<09:50, 1.55it/s, loss=0.22, lr=1]\nSteps: 39%|███▉ | 585/1500 [06:19<09:50, 1.55it/s, loss=0.281, lr=1]\nSteps: 39%|███▉ | 586/1500 [06:20<09:49, 1.55it/s, loss=0.281, lr=1]\nSteps: 39%|███▉ | 586/1500 [06:20<09:49, 1.55it/s, loss=0.103, lr=1]\nSteps: 39%|███▉ | 587/1500 [06:20<09:48, 1.55it/s, loss=0.103, lr=1]\nSteps: 39%|███▉ | 587/1500 [06:20<09:48, 1.55it/s, loss=0.141, lr=1]\nSteps: 39%|███▉ | 588/1500 [06:21<09:48, 1.55it/s, loss=0.141, lr=1]\nSteps: 39%|███▉ | 588/1500 [06:21<09:48, 1.55it/s, loss=0.109, lr=1]\nSteps: 39%|███▉ | 589/1500 [06:21<09:47, 1.55it/s, loss=0.109, lr=1]\nSteps: 39%|███▉ | 589/1500 [06:21<09:47, 1.55it/s, loss=0.127, lr=1]\nSteps: 39%|███▉ | 590/1500 [06:22<09:46, 1.55it/s, loss=0.127, lr=1]\nSteps: 39%|███▉ | 590/1500 [06:22<09:46, 1.55it/s, loss=0.125, lr=1]\nSteps: 39%|███▉ | 591/1500 [06:23<09:45, 1.55it/s, loss=0.125, lr=1]\nSteps: 39%|███▉ | 591/1500 [06:23<09:45, 1.55it/s, loss=0.0589, lr=1]\nSteps: 39%|███▉ | 592/1500 [06:23<09:45, 1.55it/s, loss=0.0589, lr=1]\nSteps: 39%|███▉ | 592/1500 [06:23<09:45, 1.55it/s, loss=0.251, lr=1] \nSteps: 40%|███▉ | 593/1500 [06:24<09:48, 1.54it/s, loss=0.251, lr=1]\nSteps: 40%|███▉ 
| 593/1500 [06:24<09:48, 1.54it/s, loss=0.218, lr=1]\nSteps: 40%|███▉ | 594/1500 [06:25<09:47, 1.54it/s, loss=0.218, lr=1]\nSteps: 40%|███▉ | 594/1500 [06:25<09:47, 1.54it/s, loss=0.169, lr=1]\nSteps: 40%|███▉ | 595/1500 [06:25<09:46, 1.54it/s, loss=0.169, lr=1]\nSteps: 40%|███▉ | 595/1500 [06:25<09:46, 1.54it/s, loss=0.0959, lr=1]\nSteps: 40%|███▉ | 596/1500 [06:26<09:45, 1.55it/s, loss=0.0959, lr=1]\nSteps: 40%|███▉ | 596/1500 [06:26<09:45, 1.55it/s, loss=0.135, lr=1] \nSteps: 40%|███▉ | 597/1500 [06:27<09:44, 1.55it/s, loss=0.135, lr=1]\nSteps: 40%|███▉ | 597/1500 [06:27<09:44, 1.55it/s, loss=0.137, lr=1]\nSteps: 40%|███▉ | 598/1500 [06:27<09:43, 1.55it/s, loss=0.137, lr=1]\nSteps: 40%|███▉ | 598/1500 [06:27<09:43, 1.55it/s, loss=0.229, lr=1]\nSteps: 40%|███▉ | 599/1500 [06:28<09:41, 1.55it/s, loss=0.229, lr=1]\nSteps: 40%|███▉ | 599/1500 [06:28<09:41, 1.55it/s, loss=0.0752, lr=1]\nSteps: 40%|████ | 600/1500 [06:29<09:40, 1.55it/s, loss=0.0752, lr=1]\nSteps: 40%|████ | 600/1500 [06:29<09:40, 1.55it/s, loss=0.246, lr=1] \nSteps: 40%|████ | 601/1500 [06:29<09:39, 1.55it/s, loss=0.246, lr=1]\nSteps: 40%|████ | 601/1500 [06:29<09:39, 1.55it/s, loss=0.176, lr=1]\nSteps: 40%|████ | 602/1500 [06:30<09:38, 1.55it/s, loss=0.176, lr=1]\nSteps: 40%|████ | 602/1500 [06:30<09:38, 1.55it/s, loss=0.153, lr=1]\nSteps: 40%|████ | 603/1500 [06:30<09:37, 1.55it/s, loss=0.153, lr=1]\nSteps: 40%|████ | 603/1500 [06:30<09:37, 1.55it/s, loss=0.174, lr=1]\nSteps: 40%|████ | 604/1500 [06:31<09:36, 1.55it/s, loss=0.174, lr=1]\nSteps: 40%|████ | 604/1500 [06:31<09:36, 1.55it/s, loss=0.124, lr=1]\nSteps: 40%|████ | 605/1500 [06:32<09:36, 1.55it/s, loss=0.124, lr=1]\nSteps: 40%|████ | 605/1500 [06:32<09:36, 1.55it/s, loss=0.141, lr=1]\nSteps: 40%|████ | 606/1500 [06:32<09:35, 1.55it/s, loss=0.141, lr=1]\nSteps: 40%|████ | 606/1500 [06:32<09:35, 1.55it/s, loss=0.0681, lr=1]\nSteps: 40%|████ | 607/1500 [06:33<09:34, 1.55it/s, loss=0.0681, lr=1]\nSteps: 40%|████ | 607/1500 [06:33<09:34, 
1.55it/s, loss=0.148, lr=1] \nSteps: 41%|████ | 608/1500 [06:34<09:36, 1.55it/s, loss=0.148, lr=1]\nSteps: 41%|████ | 608/1500 [06:34<09:36, 1.55it/s, loss=0.0295, lr=1]\nSteps: 41%|████ | 609/1500 [06:34<09:45, 1.52it/s, loss=0.0295, lr=1]\nSteps: 41%|████ | 609/1500 [06:34<09:45, 1.52it/s, loss=0.127, lr=1] \nSteps: 41%|████ | 610/1500 [06:35<09:42, 1.53it/s, loss=0.127, lr=1]\nSteps: 41%|████ | 610/1500 [06:35<09:42, 1.53it/s, loss=0.213, lr=1]\nSteps: 41%|████ | 611/1500 [06:36<09:38, 1.54it/s, loss=0.213, lr=1]\nSteps: 41%|████ | 611/1500 [06:36<09:38, 1.54it/s, loss=0.0364, lr=1]\nSteps: 41%|████ | 612/1500 [06:36<09:36, 1.54it/s, loss=0.0364, lr=1]\nSteps: 41%|████ | 612/1500 [06:36<09:36, 1.54it/s, loss=0.237, lr=1] \nSteps: 41%|████ | 613/1500 [06:37<09:34, 1.54it/s, loss=0.237, lr=1]\nSteps: 41%|████ | 613/1500 [06:37<09:34, 1.54it/s, loss=0.0965, lr=1]\nSteps: 41%|████ | 614/1500 [06:38<09:33, 1.55it/s, loss=0.0965, lr=1]\nSteps: 41%|████ | 614/1500 [06:38<09:33, 1.55it/s, loss=0.141, lr=1] \nSteps: 41%|████ | 615/1500 [06:38<09:31, 1.55it/s, loss=0.141, lr=1]\nSteps: 41%|████ | 615/1500 [06:38<09:31, 1.55it/s, loss=0.131, lr=1]\nSteps: 41%|████ | 616/1500 [06:39<09:30, 1.55it/s, loss=0.131, lr=1]\nSteps: 41%|████ | 616/1500 [06:39<09:30, 1.55it/s, loss=0.0822, lr=1]\nSteps: 41%|████ | 617/1500 [06:40<09:29, 1.55it/s, loss=0.0822, lr=1]\nSteps: 41%|████ | 617/1500 [06:40<09:29, 1.55it/s, loss=0.108, lr=1] \nSteps: 41%|████ | 618/1500 [06:40<09:28, 1.55it/s, loss=0.108, lr=1]\nSteps: 41%|████ | 618/1500 [06:40<09:28, 1.55it/s, loss=0.0857, lr=1]\nSteps: 41%|████▏ | 619/1500 [06:41<09:28, 1.55it/s, loss=0.0857, lr=1]\nSteps: 41%|████▏ | 619/1500 [06:41<09:28, 1.55it/s, loss=0.0704, lr=1]\nSteps: 41%|████▏ | 620/1500 [06:41<09:27, 1.55it/s, loss=0.0704, lr=1]\nSteps: 41%|████▏ | 620/1500 [06:41<09:27, 1.55it/s, loss=0.0982, lr=1]\nSteps: 41%|████▏ | 621/1500 [06:42<09:26, 1.55it/s, loss=0.0982, lr=1]\nSteps: 41%|████▏ | 621/1500 [06:42<09:26, 1.55it/s, 
loss=0.145, lr=1] \nSteps: 41%|████▏ | 622/1500 [06:43<09:25, 1.55it/s, loss=0.145, lr=1]\nSteps: 41%|████▏ | 622/1500 [06:43<09:25, 1.55it/s, loss=0.144, lr=1]\nSteps: 42%|████▏ | 623/1500 [06:43<09:25, 1.55it/s, loss=0.144, lr=1]\nSteps: 42%|████▏ | 623/1500 [06:43<09:25, 1.55it/s, loss=0.167, lr=1]\nSteps: 42%|████▏ | 624/1500 [06:44<09:24, 1.55it/s, loss=0.167, lr=1]\nSteps: 42%|████▏ | 624/1500 [06:44<09:24, 1.55it/s, loss=0.119, lr=1]\nSteps: 42%|████▏ | 625/1500 [06:45<09:27, 1.54it/s, loss=0.119, lr=1]\nSteps: 42%|████▏ | 625/1500 [06:45<09:27, 1.54it/s, loss=0.0489, lr=1]\nSteps: 42%|████▏ | 626/1500 [06:45<09:25, 1.54it/s, loss=0.0489, lr=1]\nSteps: 42%|████▏ | 626/1500 [06:45<09:25, 1.54it/s, loss=0.0774, lr=1]\nSteps: 42%|████▏ | 627/1500 [06:46<09:24, 1.55it/s, loss=0.0774, lr=1]\nSteps: 42%|████▏ | 627/1500 [06:46<09:24, 1.55it/s, loss=0.1, lr=1] \nSteps: 42%|████▏ | 628/1500 [06:47<09:24, 1.54it/s, loss=0.1, lr=1]\nSteps: 42%|████▏ | 628/1500 [06:47<09:24, 1.54it/s, loss=0.13, lr=1]\nSteps: 42%|████▏ | 629/1500 [06:47<09:24, 1.54it/s, loss=0.13, lr=1]\nSteps: 42%|████▏ | 629/1500 [06:47<09:24, 1.54it/s, loss=0.0711, lr=1]\nSteps: 42%|████▏ | 630/1500 [06:48<09:24, 1.54it/s, loss=0.0711, lr=1]\nSteps: 42%|████▏ | 630/1500 [06:48<09:24, 1.54it/s, loss=0.118, lr=1] \nSteps: 42%|████▏ | 631/1500 [06:49<09:22, 1.54it/s, loss=0.118, lr=1]\nSteps: 42%|████▏ | 631/1500 [06:49<09:22, 1.54it/s, loss=0.109, lr=1]\nSteps: 42%|████▏ | 632/1500 [06:49<09:21, 1.55it/s, loss=0.109, lr=1]\nSteps: 42%|████▏ | 632/1500 [06:49<09:21, 1.55it/s, loss=0.0255, lr=1]\nSteps: 42%|████▏ | 633/1500 [06:50<09:20, 1.55it/s, loss=0.0255, lr=1]\nSteps: 42%|████▏ | 633/1500 [06:50<09:20, 1.55it/s, loss=0.0847, lr=1]\nSteps: 42%|████▏ | 634/1500 [06:51<09:20, 1.55it/s, loss=0.0847, lr=1]\nSteps: 42%|████▏ | 634/1500 [06:51<09:20, 1.55it/s, loss=0.141, lr=1] \nSteps: 42%|████▏ | 635/1500 [06:51<09:18, 1.55it/s, loss=0.141, lr=1]\nSteps: 42%|████▏ | 635/1500 [06:51<09:18, 1.55it/s, 
loss=0.167, lr=1]\nSteps: 42%|████▏ | 636/1500 [06:52<09:17, 1.55it/s, loss=0.167, lr=1]\nSteps: 42%|████▏ | 636/1500 [06:52<09:17, 1.55it/s, loss=0.0701, lr=1]\nSteps: 42%|████▏ | 637/1500 [06:52<09:16, 1.55it/s, loss=0.0701, lr=1]\nSteps: 42%|████▏ | 637/1500 [06:52<09:16, 1.55it/s, loss=0.161, lr=1] \nSteps: 43%|████▎ | 638/1500 [06:53<09:15, 1.55it/s, loss=0.161, lr=1]\nSteps: 43%|████▎ | 638/1500 [06:53<09:15, 1.55it/s, loss=0.22, lr=1] \nSteps: 43%|████▎ | 639/1500 [06:54<09:16, 1.55it/s, loss=0.22, lr=1]\nSteps: 43%|████▎ | 639/1500 [06:54<09:16, 1.55it/s, loss=0.0763, lr=1]\nSteps: 43%|████▎ | 640/1500 [06:54<09:14, 1.55it/s, loss=0.0763, lr=1]\nSteps: 43%|████▎ | 640/1500 [06:54<09:14, 1.55it/s, loss=0.125, lr=1] \nSteps: 43%|████▎ | 641/1500 [06:55<09:17, 1.54it/s, loss=0.125, lr=1]\nSteps: 43%|████▎ | 641/1500 [06:55<09:17, 1.54it/s, loss=0.15, lr=1] \nSteps: 43%|████▎ | 642/1500 [06:56<09:16, 1.54it/s, loss=0.15, lr=1]\nSteps: 43%|████▎ | 642/1500 [06:56<09:16, 1.54it/s, loss=0.112, lr=1]\nSteps: 43%|████▎ | 643/1500 [06:56<09:15, 1.54it/s, loss=0.112, lr=1]\nSteps: 43%|████▎ | 643/1500 [06:56<09:15, 1.54it/s, loss=0.146, lr=1]\nSteps: 43%|████▎ | 644/1500 [06:57<09:14, 1.54it/s, loss=0.146, lr=1]\nSteps: 43%|████▎ | 644/1500 [06:57<09:14, 1.54it/s, loss=0.0801, lr=1]\nSteps: 43%|████▎ | 645/1500 [06:58<09:12, 1.55it/s, loss=0.0801, lr=1]\nSteps: 43%|████▎ | 645/1500 [06:58<09:12, 1.55it/s, loss=0.0814, lr=1]\nSteps: 43%|████▎ | 646/1500 [06:58<09:12, 1.55it/s, loss=0.0814, lr=1]\nSteps: 43%|████▎ | 646/1500 [06:58<09:12, 1.55it/s, loss=0.0662, lr=1]\nSteps: 43%|████▎ | 647/1500 [06:59<09:12, 1.54it/s, loss=0.0662, lr=1]\nSteps: 43%|████▎ | 647/1500 [06:59<09:12, 1.54it/s, loss=0.161, lr=1] \nSteps: 43%|████▎ | 648/1500 [07:00<09:16, 1.53it/s, loss=0.161, lr=1]\nSteps: 43%|████▎ | 648/1500 [07:00<09:16, 1.53it/s, loss=0.149, lr=1]\nSteps: 43%|████▎ | 649/1500 [07:00<09:14, 1.53it/s, loss=0.149, lr=1]\nSteps: 43%|████▎ | 649/1500 [07:00<09:14, 1.53it/s, 
loss=0.0361, lr=1]\nSteps: 43%|████▎ | 650/1500 [07:01<09:13, 1.54it/s, loss=0.0361, lr=1]\nSteps: 43%|████▎ | 650/1500 [07:01<09:13, 1.54it/s, loss=0.193, lr=1] \nSteps: 43%|████▎ | 651/1500 [07:02<09:11, 1.54it/s, loss=0.193, lr=1]\nSteps: 43%|████▎ | 651/1500 [07:02<09:11, 1.54it/s, loss=0.13, lr=1] \nSteps: 43%|████▎ | 652/1500 [07:02<09:09, 1.54it/s, loss=0.13, lr=1]\nSteps: 43%|████▎ | 652/1500 [07:02<09:09, 1.54it/s, loss=0.118, lr=1]\nSteps: 44%|████▎ | 653/1500 [07:03<09:08, 1.55it/s, loss=0.118, lr=1]\nSteps: 44%|████▎ | 653/1500 [07:03<09:08, 1.55it/s, loss=0.0854, lr=1]\nSteps: 44%|████▎ | 654/1500 [07:04<09:06, 1.55it/s, loss=0.0854, lr=1]\nSteps: 44%|████▎ | 654/1500 [07:04<09:06, 1.55it/s, loss=0.107, lr=1] \nSteps: 44%|████▎ | 655/1500 [07:04<09:09, 1.54it/s, loss=0.107, lr=1]\nSteps: 44%|████▎ | 655/1500 [07:04<09:09, 1.54it/s, loss=0.0942, lr=1]\nSteps: 44%|████▎ | 656/1500 [07:05<09:14, 1.52it/s, loss=0.0942, lr=1]\nSteps: 44%|████▎ | 656/1500 [07:05<09:14, 1.52it/s, loss=0.0586, lr=1]\nSteps: 44%|████▍ | 657/1500 [07:05<09:13, 1.52it/s, loss=0.0586, lr=1]\nSteps: 44%|████▍ | 657/1500 [07:05<09:13, 1.52it/s, loss=0.169, lr=1] \nSteps: 44%|████▍ | 658/1500 [07:06<09:10, 1.53it/s, loss=0.169, lr=1]\nSteps: 44%|████▍ | 658/1500 [07:06<09:10, 1.53it/s, loss=0.0382, lr=1]\nSteps: 44%|████▍ | 659/1500 [07:07<09:10, 1.53it/s, loss=0.0382, lr=1]\nSteps: 44%|████▍ | 659/1500 [07:07<09:10, 1.53it/s, loss=0.106, lr=1] \nSteps: 44%|████▍ | 660/1500 [07:07<09:07, 1.53it/s, loss=0.106, lr=1]\nSteps: 44%|████▍ | 660/1500 [07:07<09:07, 1.53it/s, loss=0.244, lr=1]\nSteps: 44%|████▍ | 661/1500 [07:08<09:05, 1.54it/s, loss=0.244, lr=1]\nSteps: 44%|████▍ | 661/1500 [07:08<09:05, 1.54it/s, loss=0.0492, lr=1]\nSteps: 44%|████▍ | 662/1500 [07:09<09:04, 1.54it/s, loss=0.0492, lr=1]\nSteps: 44%|████▍ | 662/1500 [07:09<09:04, 1.54it/s, loss=0.26, lr=1] \nSteps: 44%|████▍ | 663/1500 [07:09<09:03, 1.54it/s, loss=0.26, lr=1]\nSteps: 44%|████▍ | 663/1500 [07:09<09:03, 
1.54it/s, loss=0.0834, lr=1]\nSteps: 44%|████▍ | 664/1500 [07:10<09:02, 1.54it/s, loss=0.0834, lr=1]\nSteps: 44%|████▍ | 664/1500 [07:10<09:02, 1.54it/s, loss=0.0897, lr=1]\nSteps: 44%|████▍ | 665/1500 [07:11<09:01, 1.54it/s, loss=0.0897, lr=1]\nSteps: 44%|████▍ | 665/1500 [07:11<09:01, 1.54it/s, loss=0.105, lr=1] \nSteps: 44%|████▍ | 666/1500 [07:11<08:59, 1.54it/s, loss=0.105, lr=1]\nSteps: 44%|████▍ | 666/1500 [07:11<08:59, 1.54it/s, loss=0.156, lr=1]\nSteps: 44%|████▍ | 667/1500 [07:12<08:58, 1.55it/s, loss=0.156, lr=1]\nSteps: 44%|████▍ | 667/1500 [07:12<08:58, 1.55it/s, loss=0.183, lr=1]\nSteps: 45%|████▍ | 668/1500 [07:13<08:58, 1.54it/s, loss=0.183, lr=1]\nSteps: 45%|████▍ | 668/1500 [07:13<08:58, 1.54it/s, loss=0.275, lr=1]\nSteps: 45%|████▍ | 669/1500 [07:13<08:57, 1.54it/s, loss=0.275, lr=1]\nSteps: 45%|████▍ | 669/1500 [07:13<08:57, 1.54it/s, loss=0.0672, lr=1]\nSteps: 45%|████▍ | 670/1500 [07:14<08:57, 1.54it/s, loss=0.0672, lr=1]\nSteps: 45%|████▍ | 670/1500 [07:14<08:57, 1.54it/s, loss=0.126, lr=1] \nSteps: 45%|████▍ | 671/1500 [07:15<08:56, 1.54it/s, loss=0.126, lr=1]\nSteps: 45%|████▍ | 671/1500 [07:15<08:56, 1.54it/s, loss=0.116, lr=1]\nSteps: 45%|████▍ | 672/1500 [07:15<08:55, 1.55it/s, loss=0.116, lr=1]\nSteps: 45%|████▍ | 672/1500 [07:15<08:55, 1.55it/s, loss=0.116, lr=1]\nSteps: 45%|████▍ | 673/1500 [07:16<08:57, 1.54it/s, loss=0.116, lr=1]\nSteps: 45%|████▍ | 673/1500 [07:16<08:57, 1.54it/s, loss=0.106, lr=1]\nSteps: 45%|████▍ | 674/1500 [07:17<08:55, 1.54it/s, loss=0.106, lr=1]\nSteps: 45%|████▍ | 674/1500 [07:17<08:55, 1.54it/s, loss=0.239, lr=1]\nSteps: 45%|████▌ | 675/1500 [07:17<08:53, 1.55it/s, loss=0.239, lr=1]\nSteps: 45%|████▌ | 675/1500 [07:17<08:53, 1.55it/s, loss=0.0977, lr=1]\nSteps: 45%|████▌ | 676/1500 [07:18<08:52, 1.55it/s, loss=0.0977, lr=1]\nSteps: 45%|████▌ | 676/1500 [07:18<08:52, 1.55it/s, loss=0.226, lr=1] \nSteps: 45%|████▌ | 677/1500 [07:18<08:51, 1.55it/s, loss=0.226, lr=1]\nSteps: 45%|████▌ | 677/1500 [07:18<08:51, 
1.55it/s, loss=0.0782, lr=1]\nSteps: 45%|████▌ | 678/1500 [07:19<08:50, 1.55it/s, loss=0.0782, lr=1]\nSteps: 45%|████▌ | 678/1500 [07:19<08:50, 1.55it/s, loss=0.0763, lr=1]\nSteps: 45%|████▌ | 679/1500 [07:20<08:50, 1.55it/s, loss=0.0763, lr=1]\nSteps: 45%|████▌ | 679/1500 [07:20<08:50, 1.55it/s, loss=0.144, lr=1] \nSteps: 45%|████▌ | 680/1500 [07:20<08:50, 1.54it/s, loss=0.144, lr=1]\nSteps: 45%|████▌ | 680/1500 [07:20<08:50, 1.54it/s, loss=0.0528, lr=1]\nSteps: 45%|████▌ | 681/1500 [07:21<08:50, 1.54it/s, loss=0.0528, lr=1]\nSteps: 45%|████▌ | 681/1500 [07:21<08:50, 1.54it/s, loss=0.135, lr=1] \nSteps: 45%|████▌ | 682/1500 [07:22<08:50, 1.54it/s, loss=0.135, lr=1]\nSteps: 45%|████▌ | 682/1500 [07:22<08:50, 1.54it/s, loss=0.193, lr=1]\nSteps: 46%|████▌ | 683/1500 [07:22<08:49, 1.54it/s, loss=0.193, lr=1]\nSteps: 46%|████▌ | 683/1500 [07:22<08:49, 1.54it/s, loss=0.153, lr=1]\nSteps: 46%|████▌ | 684/1500 [07:23<08:48, 1.54it/s, loss=0.153, lr=1]\nSteps: 46%|████▌ | 684/1500 [07:23<08:48, 1.54it/s, loss=0.131, lr=1]\nSteps: 46%|████▌ | 685/1500 [07:24<08:48, 1.54it/s, loss=0.131, lr=1]\nSteps: 46%|████▌ | 685/1500 [07:24<08:48, 1.54it/s, loss=0.0853, lr=1]\nSteps: 46%|████▌ | 686/1500 [07:24<08:47, 1.54it/s, loss=0.0853, lr=1]\nSteps: 46%|████▌ | 686/1500 [07:24<08:47, 1.54it/s, loss=0.106, lr=1] \nSteps: 46%|████▌ | 687/1500 [07:25<08:46, 1.54it/s, loss=0.106, lr=1]\nSteps: 46%|████▌ | 687/1500 [07:25<08:46, 1.54it/s, loss=0.259, lr=1]\nSteps: 46%|████▌ | 688/1500 [07:26<08:44, 1.55it/s, loss=0.259, lr=1]\nSteps: 46%|████▌ | 688/1500 [07:26<08:44, 1.55it/s, loss=0.163, lr=1]\nSteps: 46%|████▌ | 689/1500 [07:26<08:48, 1.53it/s, loss=0.163, lr=1]\nSteps: 46%|████▌ | 689/1500 [07:26<08:48, 1.53it/s, loss=0.0191, lr=1]\nSteps: 46%|████▌ | 690/1500 [07:27<08:46, 1.54it/s, loss=0.0191, lr=1]\nSteps: 46%|████▌ | 690/1500 [07:27<08:46, 1.54it/s, loss=0.00796, lr=1]\nSteps: 46%|████▌ | 691/1500 [07:28<08:44, 1.54it/s, loss=0.00796, lr=1]\nSteps: 46%|████▌ | 691/1500 
[07:28<08:44, 1.54it/s, loss=0.154, lr=1] \nSteps: 46%|████▌ | 692/1500 [07:28<08:42, 1.55it/s, loss=0.154, lr=1]\nSteps: 46%|████▌ | 692/1500 [07:28<08:42, 1.55it/s, loss=0.0745, lr=1]\nSteps: 46%|████▌ | 693/1500 [07:29<08:41, 1.55it/s, loss=0.0745, lr=1]\nSteps: 46%|████▌ | 693/1500 [07:29<08:41, 1.55it/s, loss=0.159, lr=1] \nSteps: 46%|████▋ | 694/1500 [07:29<08:40, 1.55it/s, loss=0.159, lr=1]\nSteps: 46%|████▋ | 694/1500 [07:29<08:40, 1.55it/s, loss=0.148, lr=1]\nSteps: 46%|████▋ | 695/1500 [07:30<08:39, 1.55it/s, loss=0.148, lr=1]\nSteps: 46%|████▋ | 695/1500 [07:30<08:39, 1.55it/s, loss=0.233, lr=1]\nSteps: 46%|████▋ | 696/1500 [07:31<08:38, 1.55it/s, loss=0.233, lr=1]\nSteps: 46%|████▋ | 696/1500 [07:31<08:38, 1.55it/s, loss=0.119, lr=1]\nSteps: 46%|████▋ | 697/1500 [07:31<08:37, 1.55it/s, loss=0.119, lr=1]\nSteps: 46%|████▋ | 697/1500 [07:31<08:37, 1.55it/s, loss=0.111, lr=1]\nSteps: 47%|████▋ | 698/1500 [07:32<08:37, 1.55it/s, loss=0.111, lr=1]\nSteps: 47%|████▋ | 698/1500 [07:32<08:37, 1.55it/s, loss=0.206, lr=1]\nSteps: 47%|████▋ | 699/1500 [07:33<08:36, 1.55it/s, loss=0.206, lr=1]\nSteps: 47%|████▋ | 699/1500 [07:33<08:36, 1.55it/s, loss=0.154, lr=1]\nSteps: 47%|████▋ | 700/1500 [07:33<08:35, 1.55it/s, loss=0.154, lr=1]\nSteps: 47%|████▋ | 700/1500 [07:33<08:35, 1.55it/s, loss=0.246, lr=1]\nSteps: 47%|████▋ | 701/1500 [07:34<08:37, 1.54it/s, loss=0.246, lr=1]\nSteps: 47%|████▋ | 701/1500 [07:34<08:37, 1.54it/s, loss=0.025, lr=1]\nSteps: 47%|████▋ | 702/1500 [07:35<08:35, 1.55it/s, loss=0.025, lr=1]\nSteps: 47%|████▋ | 702/1500 [07:35<08:35, 1.55it/s, loss=0.161, lr=1]\nSteps: 47%|████▋ | 703/1500 [07:35<08:34, 1.55it/s, loss=0.161, lr=1]\nSteps: 47%|████▋ | 703/1500 [07:35<08:34, 1.55it/s, loss=0.0645, lr=1]\nSteps: 47%|████▋ | 704/1500 [07:36<08:33, 1.55it/s, loss=0.0645, lr=1]\nSteps: 47%|████▋ | 704/1500 [07:36<08:33, 1.55it/s, loss=0.135, lr=1] \nSteps: 47%|████▋ | 705/1500 [07:37<08:36, 1.54it/s, loss=0.135, lr=1]\nSteps: 47%|████▋ | 705/1500 
[07:37<08:36, 1.54it/s, loss=0.0357, lr=1]\nSteps: 47%|████▋ | 706/1500 [07:37<08:34, 1.54it/s, loss=0.0357, lr=1]\nSteps: 47%|████▋ | 706/1500 [07:37<08:34, 1.54it/s, loss=0.141, lr=1] \nSteps: 47%|████▋ | 707/1500 [07:38<08:33, 1.55it/s, loss=0.141, lr=1]\nSteps: 47%|████▋ | 707/1500 [07:38<08:33, 1.55it/s, loss=0.0497, lr=1]\nSteps: 47%|████▋ | 708/1500 [07:39<08:31, 1.55it/s, loss=0.0497, lr=1]\nSteps: 47%|████▋ | 708/1500 [07:39<08:31, 1.55it/s, loss=0.136, lr=1] \nSteps: 47%|████▋ | 709/1500 [07:39<08:30, 1.55it/s, loss=0.136, lr=1]\nSteps: 47%|████▋ | 709/1500 [07:39<08:30, 1.55it/s, loss=0.0996, lr=1]\nSteps: 47%|████▋ | 710/1500 [07:40<08:29, 1.55it/s, loss=0.0996, lr=1]\nSteps: 47%|████▋ | 710/1500 [07:40<08:29, 1.55it/s, loss=0.0778, lr=1]\nSteps: 47%|████▋ | 711/1500 [07:40<08:28, 1.55it/s, loss=0.0778, lr=1]\nSteps: 47%|████▋ | 711/1500 [07:40<08:28, 1.55it/s, loss=0.118, lr=1] \nSteps: 47%|████▋ | 712/1500 [07:41<08:28, 1.55it/s, loss=0.118, lr=1]\nSteps: 47%|████▋ | 712/1500 [07:41<08:28, 1.55it/s, loss=0.131, lr=1]\nSteps: 48%|████▊ | 713/1500 [07:42<08:27, 1.55it/s, loss=0.131, lr=1]\nSteps: 48%|████▊ | 713/1500 [07:42<08:27, 1.55it/s, loss=0.13, lr=1] \nSteps: 48%|████▊ | 714/1500 [07:42<08:26, 1.55it/s, loss=0.13, lr=1]\nSteps: 48%|████▊ | 714/1500 [07:42<08:26, 1.55it/s, loss=0.141, lr=1]\nSteps: 48%|████▊ | 715/1500 [07:43<08:25, 1.55it/s, loss=0.141, lr=1]\nSteps: 48%|████▊ | 715/1500 [07:43<08:25, 1.55it/s, loss=0.0741, lr=1]\nSteps: 48%|████▊ | 716/1500 [07:44<08:26, 1.55it/s, loss=0.0741, lr=1]\nSteps: 48%|████▊ | 716/1500 [07:44<08:26, 1.55it/s, loss=0.0596, lr=1]\nSteps: 48%|████▊ | 717/1500 [07:44<08:25, 1.55it/s, loss=0.0596, lr=1]\nSteps: 48%|████▊ | 717/1500 [07:44<08:25, 1.55it/s, loss=0.051, lr=1] \nSteps: 48%|████▊ | 718/1500 [07:45<08:24, 1.55it/s, loss=0.051, lr=1]\nSteps: 48%|████▊ | 718/1500 [07:45<08:24, 1.55it/s, loss=0.0879, lr=1]\nSteps: 48%|████▊ | 719/1500 [07:46<08:23, 1.55it/s, loss=0.0879, lr=1]\nSteps: 48%|████▊ | 
719/1500 [07:46<08:23, 1.55it/s, loss=0.158, lr=1] \nSteps: 48%|████▊ | 720/1500 [07:46<08:23, 1.55it/s, loss=0.158, lr=1]\nSteps: 48%|████▊ | 720/1500 [07:46<08:23, 1.55it/s, loss=0.11, lr=1] \nSteps: 48%|████▊ | 721/1500 [07:47<08:25, 1.54it/s, loss=0.11, lr=1]\nSteps: 48%|████▊ | 721/1500 [07:47<08:25, 1.54it/s, loss=0.0909, lr=1]\nSteps: 48%|████▊ | 722/1500 [07:48<08:23, 1.55it/s, loss=0.0909, lr=1]\nSteps: 48%|████▊ | 722/1500 [07:48<08:23, 1.55it/s, loss=0.224, lr=1] \nSteps: 48%|████▊ | 723/1500 [07:48<08:21, 1.55it/s, loss=0.224, lr=1]\nSteps: 48%|████▊ | 723/1500 [07:48<08:21, 1.55it/s, loss=0.118, lr=1]\nSteps: 48%|████▊ | 724/1500 [07:49<08:20, 1.55it/s, loss=0.118, lr=1]\nSteps: 48%|████▊ | 724/1500 [07:49<08:20, 1.55it/s, loss=0.153, lr=1]\nSteps: 48%|████▊ | 725/1500 [07:49<08:20, 1.55it/s, loss=0.153, lr=1]\nSteps: 48%|████▊ | 725/1500 [07:49<08:20, 1.55it/s, loss=0.174, lr=1]\nSteps: 48%|████▊ | 726/1500 [07:50<08:19, 1.55it/s, loss=0.174, lr=1]\nSteps: 48%|████▊ | 726/1500 [07:50<08:19, 1.55it/s, loss=0.0981, lr=1]\nSteps: 48%|████▊ | 727/1500 [07:51<08:18, 1.55it/s, loss=0.0981, lr=1]\nSteps: 48%|████▊ | 727/1500 [07:51<08:18, 1.55it/s, loss=0.149, lr=1] \nSteps: 49%|████▊ | 728/1500 [07:51<08:17, 1.55it/s, loss=0.149, lr=1]\nSteps: 49%|████▊ | 728/1500 [07:51<08:17, 1.55it/s, loss=0.0629, lr=1]\nSteps: 49%|████▊ | 729/1500 [07:52<08:17, 1.55it/s, loss=0.0629, lr=1]\nSteps: 49%|████▊ | 729/1500 [07:52<08:17, 1.55it/s, loss=0.0951, lr=1]\nSteps: 49%|████▊ | 730/1500 [07:53<08:16, 1.55it/s, loss=0.0951, lr=1]\nSteps: 49%|████▊ | 730/1500 [07:53<08:16, 1.55it/s, loss=0.0587, lr=1]\nSteps: 49%|████▊ | 731/1500 [07:53<08:15, 1.55it/s, loss=0.0587, lr=1]\nSteps: 49%|████▊ | 731/1500 [07:53<08:15, 1.55it/s, loss=0.204, lr=1] \nSteps: 49%|████▉ | 732/1500 [07:54<08:15, 1.55it/s, loss=0.204, lr=1]\nSteps: 49%|████▉ | 732/1500 [07:54<08:15, 1.55it/s, loss=0.114, lr=1]\nSteps: 49%|████▉ | 733/1500 [07:55<08:15, 1.55it/s, loss=0.114, lr=1]\nSteps: 49%|████▉ 
| 733/1500 [07:55<08:15, 1.55it/s, loss=0.282, lr=1]\nSteps: 49%|████▉ | 734/1500 [07:55<08:14, 1.55it/s, loss=0.282, lr=1]\nSteps: 49%|████▉ | 734/1500 [07:55<08:14, 1.55it/s, loss=0.166, lr=1]\nSteps: 49%|████▉ | 735/1500 [07:56<08:13, 1.55it/s, loss=0.166, lr=1]\nSteps: 49%|████▉ | 735/1500 [07:56<08:13, 1.55it/s, loss=0.139, lr=1]\nSteps: 49%|████▉ | 736/1500 [07:57<08:12, 1.55it/s, loss=0.139, lr=1]\nSteps: 49%|████▉ | 736/1500 [07:57<08:12, 1.55it/s, loss=0.101, lr=1]\nSteps: 49%|████▉ | 737/1500 [07:57<08:14, 1.54it/s, loss=0.101, lr=1]\nSteps: 49%|████▉ | 737/1500 [07:57<08:14, 1.54it/s, loss=0.103, lr=1]\nSteps: 49%|████▉ | 738/1500 [07:58<08:13, 1.54it/s, loss=0.103, lr=1]\nSteps: 49%|████▉ | 738/1500 [07:58<08:13, 1.54it/s, loss=0.216, lr=1]\nSteps: 49%|████▉ | 739/1500 [07:59<08:12, 1.54it/s, loss=0.216, lr=1]\nSteps: 49%|████▉ | 739/1500 [07:59<08:12, 1.54it/s, loss=0.166, lr=1]\nSteps: 49%|████▉ | 740/1500 [07:59<08:11, 1.55it/s, loss=0.166, lr=1]\nSteps: 49%|████▉ | 740/1500 [07:59<08:11, 1.55it/s, loss=0.0985, lr=1]\nSteps: 49%|████▉ | 741/1500 [08:00<08:10, 1.55it/s, loss=0.0985, lr=1]\nSteps: 49%|████▉ | 741/1500 [08:00<08:10, 1.55it/s, loss=0.184, lr=1] \nSteps: 49%|████▉ | 742/1500 [08:00<08:09, 1.55it/s, loss=0.184, lr=1]\nSteps: 49%|████▉ | 742/1500 [08:00<08:09, 1.55it/s, loss=0.0529, lr=1]\nSteps: 50%|████▉ | 743/1500 [08:01<08:08, 1.55it/s, loss=0.0529, lr=1]\nSteps: 50%|████▉ | 743/1500 [08:01<08:08, 1.55it/s, loss=0.0954, lr=1]\nSteps: 50%|████▉ | 744/1500 [08:02<08:07, 1.55it/s, loss=0.0954, lr=1]\nSteps: 50%|████▉ | 744/1500 [08:02<08:07, 1.55it/s, loss=0.126, lr=1] \nSteps: 50%|████▉ | 745/1500 [08:02<08:06, 1.55it/s, loss=0.126, lr=1]\nSteps: 50%|████▉ | 745/1500 [08:02<08:06, 1.55it/s, loss=0.162, lr=1]\nSteps: 50%|████▉ | 746/1500 [08:03<08:05, 1.55it/s, loss=0.162, lr=1]\nSteps: 50%|████▉ | 746/1500 [08:03<08:05, 1.55it/s, loss=0.103, lr=1]\nSteps: 50%|████▉ | 747/1500 [08:04<08:08, 1.54it/s, loss=0.103, lr=1]\nSteps: 50%|████▉ | 
747/1500 [08:04<08:08, 1.54it/s, loss=0.161, lr=1]\nSteps: 50%|████▉ | 748/1500 [08:04<08:08, 1.54it/s, loss=0.161, lr=1]\nSteps: 50%|████▉ | 748/1500 [08:04<08:08, 1.54it/s, loss=0.132, lr=1]\nSteps: 50%|████▉ | 749/1500 [08:05<08:06, 1.54it/s, loss=0.132, lr=1]\nSteps: 50%|████▉ | 749/1500 [08:05<08:06, 1.54it/s, loss=0.119, lr=1]\nSteps: 50%|█████ | 750/1500 [08:06<08:04, 1.55it/s, loss=0.119, lr=1]\nSteps: 50%|█████ | 750/1500 [08:06<08:04, 1.55it/s, loss=0.0234, lr=1]\nSteps: 50%|█████ | 751/1500 [08:06<08:03, 1.55it/s, loss=0.0234, lr=1]\nSteps: 50%|█████ | 751/1500 [08:06<08:03, 1.55it/s, loss=0.153, lr=1] \nSteps: 50%|█████ | 752/1500 [08:07<08:03, 1.55it/s, loss=0.153, lr=1]\nSteps: 50%|█████ | 752/1500 [08:07<08:03, 1.55it/s, loss=0.039, lr=1]\nSteps: 50%|█████ | 753/1500 [08:08<08:05, 1.54it/s, loss=0.039, lr=1]\nSteps: 50%|█████ | 753/1500 [08:08<08:05, 1.54it/s, loss=0.17, lr=1] \nSteps: 50%|█████ | 754/1500 [08:08<08:03, 1.54it/s, loss=0.17, lr=1]\nSteps: 50%|█████ | 754/1500 [08:08<08:03, 1.54it/s, loss=0.228, lr=1]\nSteps: 50%|█████ | 755/1500 [08:09<08:02, 1.54it/s, loss=0.228, lr=1]\nSteps: 50%|█████ | 755/1500 [08:09<08:02, 1.54it/s, loss=0.18, lr=1] \nSteps: 50%|█████ | 756/1500 [08:10<08:01, 1.55it/s, loss=0.18, lr=1]\nSteps: 50%|█████ | 756/1500 [08:10<08:01, 1.55it/s, loss=0.0736, lr=1]\nSteps: 50%|█████ | 757/1500 [08:10<08:00, 1.55it/s, loss=0.0736, lr=1]\nSteps: 50%|█████ | 757/1500 [08:10<08:00, 1.55it/s, loss=0.139, lr=1] \nSteps: 51%|█████ | 758/1500 [08:11<07:59, 1.55it/s, loss=0.139, lr=1]\nSteps: 51%|█████ | 758/1500 [08:11<07:59, 1.55it/s, loss=0.0602, lr=1]\nSteps: 51%|█████ | 759/1500 [08:11<07:58, 1.55it/s, loss=0.0602, lr=1]\nSteps: 51%|█████ | 759/1500 [08:11<07:58, 1.55it/s, loss=0.039, lr=1] \nSteps: 51%|█████ | 760/1500 [08:12<07:57, 1.55it/s, loss=0.039, lr=1]\nSteps: 51%|█████ | 760/1500 [08:12<07:57, 1.55it/s, loss=0.136, lr=1]\nSteps: 51%|█████ | 761/1500 [08:13<07:56, 1.55it/s, loss=0.136, lr=1]\nSteps: 51%|█████ | 
761/1500 [08:13<07:56, 1.55it/s, loss=0.179, lr=1]\nSteps: 51%|█████ | 762/1500 [08:13<07:55, 1.55it/s, loss=0.179, lr=1]\nSteps: 51%|█████ | 762/1500 [08:13<07:55, 1.55it/s, loss=0.137, lr=1]\nSteps: 51%|█████ | 763/1500 [08:14<07:55, 1.55it/s, loss=0.137, lr=1]\nSteps: 51%|█████ | 763/1500 [08:14<07:55, 1.55it/s, loss=0.187, lr=1]\nSteps: 51%|█████ | 764/1500 [08:15<07:54, 1.55it/s, loss=0.187, lr=1]\nSteps: 51%|█████ | 764/1500 [08:15<07:54, 1.55it/s, loss=0.109, lr=1]\nSteps: 51%|█████ | 765/1500 [08:15<07:53, 1.55it/s, loss=0.109, lr=1]\nSteps: 51%|█████ | 765/1500 [08:15<07:53, 1.55it/s, loss=0.125, lr=1]\nSteps: 51%|█████ | 766/1500 [08:16<07:53, 1.55it/s, loss=0.125, lr=1]\nSteps: 51%|█████ | 766/1500 [08:16<07:53, 1.55it/s, loss=0.0844, lr=1]\nSteps: 51%|█████ | 767/1500 [08:17<07:53, 1.55it/s, loss=0.0844, lr=1]\nSteps: 51%|█████ | 767/1500 [08:17<07:53, 1.55it/s, loss=0.133, lr=1] \nSteps: 51%|█████ | 768/1500 [08:17<07:52, 1.55it/s, loss=0.133, lr=1]\nSteps: 51%|█████ | 768/1500 [08:17<07:52, 1.55it/s, loss=0.167, lr=1]\nSteps: 51%|█████▏ | 769/1500 [08:18<07:54, 1.54it/s, loss=0.167, lr=1]\nSteps: 51%|█████▏ | 769/1500 [08:18<07:54, 1.54it/s, loss=0.092, lr=1]\nSteps: 51%|█████▏ | 770/1500 [08:19<07:52, 1.54it/s, loss=0.092, lr=1]\nSteps: 51%|█████▏ | 770/1500 [08:19<07:52, 1.54it/s, loss=0.25, lr=1] \nSteps: 51%|█████▏ | 771/1500 [08:19<07:51, 1.54it/s, loss=0.25, lr=1]\nSteps: 51%|█████▏ | 771/1500 [08:19<07:51, 1.54it/s, loss=0.225, lr=1]\nSteps: 51%|█████▏ | 772/1500 [08:20<07:50, 1.55it/s, loss=0.225, lr=1]\nSteps: 51%|█████▏ | 772/1500 [08:20<07:50, 1.55it/s, loss=0.0841, lr=1]\nSteps: 52%|█████▏ | 773/1500 [08:20<07:50, 1.55it/s, loss=0.0841, lr=1]\nSteps: 52%|█████▏ | 773/1500 [08:20<07:50, 1.55it/s, loss=0.0806, lr=1]\nSteps: 52%|█████▏ | 774/1500 [08:21<07:48, 1.55it/s, loss=0.0806, lr=1]\nSteps: 52%|█████▏ | 774/1500 [08:21<07:48, 1.55it/s, loss=0.0689, lr=1]\nSteps: 52%|█████▏ | 775/1500 [08:22<07:47, 1.55it/s, loss=0.0689, lr=1]\nSteps: 
52%|█████▏ | 775/1500 [08:22<07:47, 1.55it/s, loss=0.0753, lr=1]\nSteps: 52%|█████▏ | 776/1500 [08:22<07:46, 1.55it/s, loss=0.0753, lr=1]\nSteps: 52%|█████▏ | 776/1500 [08:22<07:46, 1.55it/s, loss=0.0451, lr=1]\nSteps: 52%|█████▏ | 777/1500 [08:23<07:45, 1.55it/s, loss=0.0451, lr=1]\nSteps: 52%|█████▏ | 777/1500 [08:23<07:45, 1.55it/s, loss=0.213, lr=1] \nSteps: 52%|█████▏ | 778/1500 [08:24<07:45, 1.55it/s, loss=0.213, lr=1]\nSteps: 52%|█████▏ | 778/1500 [08:24<07:45, 1.55it/s, loss=0.143, lr=1]\nSteps: 52%|█████▏ | 779/1500 [08:24<07:45, 1.55it/s, loss=0.143, lr=1]\nSteps: 52%|█████▏ | 779/1500 [08:24<07:45, 1.55it/s, loss=0.222, lr=1]\nSteps: 52%|█████▏ | 780/1500 [08:25<07:44, 1.55it/s, loss=0.222, lr=1]\nSteps: 52%|█████▏ | 780/1500 [08:25<07:44, 1.55it/s, loss=0.0748, lr=1]\nSteps: 52%|█████▏ | 781/1500 [08:26<07:43, 1.55it/s, loss=0.0748, lr=1]\nSteps: 52%|█████▏ | 781/1500 [08:26<07:43, 1.55it/s, loss=0.258, lr=1] \nSteps: 52%|█████▏ | 782/1500 [08:26<07:42, 1.55it/s, loss=0.258, lr=1]\nSteps: 52%|█████▏ | 782/1500 [08:26<07:42, 1.55it/s, loss=0.127, lr=1]\nSteps: 52%|█████▏ | 783/1500 [08:27<07:42, 1.55it/s, loss=0.127, lr=1]\nSteps: 52%|█████▏ | 783/1500 [08:27<07:42, 1.55it/s, loss=0.0774, lr=1]\nSteps: 52%|█████▏ | 784/1500 [08:28<07:41, 1.55it/s, loss=0.0774, lr=1]\nSteps: 52%|█████▏ | 784/1500 [08:28<07:41, 1.55it/s, loss=0.244, lr=1] \nSteps: 52%|█████▏ | 785/1500 [08:28<07:44, 1.54it/s, loss=0.244, lr=1]\nSteps: 52%|█████▏ | 785/1500 [08:28<07:44, 1.54it/s, loss=0.0673, lr=1]\nSteps: 52%|█████▏ | 786/1500 [08:29<07:42, 1.54it/s, loss=0.0673, lr=1]\nSteps: 52%|█████▏ | 786/1500 [08:29<07:42, 1.54it/s, loss=0.165, lr=1] \nSteps: 52%|█████▏ | 787/1500 [08:30<07:41, 1.55it/s, loss=0.165, lr=1]\nSteps: 52%|█████▏ | 787/1500 [08:30<07:41, 1.55it/s, loss=0.0884, lr=1]\nSteps: 53%|█████▎ | 788/1500 [08:30<07:40, 1.55it/s, loss=0.0884, lr=1]\nSteps: 53%|█████▎ | 788/1500 [08:30<07:40, 1.55it/s, loss=0.0722, lr=1]\nSteps: 53%|█████▎ | 789/1500 [08:31<07:39, 
1.55it/s, loss=0.0722, lr=1]\nSteps: 53%|█████▎ | 789/1500 [08:31<07:39, 1.55it/s, loss=0.138, lr=1] \nSteps: 53%|█████▎ | 790/1500 [08:31<07:38, 1.55it/s, loss=0.138, lr=1]\nSteps: 53%|█████▎ | 790/1500 [08:31<07:38, 1.55it/s, loss=0.159, lr=1]\nSteps: 53%|█████▎ | 791/1500 [08:32<07:37, 1.55it/s, loss=0.159, lr=1]\nSteps: 53%|█████▎ | 791/1500 [08:32<07:37, 1.55it/s, loss=0.0577, lr=1]\nSteps: 53%|█████▎ | 792/1500 [08:33<07:37, 1.55it/s, loss=0.0577, lr=1]\nSteps: 53%|█████▎ | 792/1500 [08:33<07:37, 1.55it/s, loss=0.146, lr=1] \nSteps: 53%|█████▎ | 793/1500 [08:33<07:36, 1.55it/s, loss=0.146, lr=1]\nSteps: 53%|█████▎ | 793/1500 [08:33<07:36, 1.55it/s, loss=0.0831, lr=1]\nSteps: 53%|█████▎ | 794/1500 [08:34<07:35, 1.55it/s, loss=0.0831, lr=1]\nSteps: 53%|█████▎ | 794/1500 [08:34<07:35, 1.55it/s, loss=0.139, lr=1] \nSteps: 53%|█████▎ | 795/1500 [08:35<07:35, 1.55it/s, loss=0.139, lr=1]\nSteps: 53%|█████▎ | 795/1500 [08:35<07:35, 1.55it/s, loss=0.119, lr=1]\nSteps: 53%|█████▎ | 796/1500 [08:35<07:34, 1.55it/s, loss=0.119, lr=1]\nSteps: 53%|█████▎ | 796/1500 [08:35<07:34, 1.55it/s, loss=0.13, lr=1] \nSteps: 53%|█████▎ | 797/1500 [08:36<07:33, 1.55it/s, loss=0.13, lr=1]\nSteps: 53%|█████▎ | 797/1500 [08:36<07:33, 1.55it/s, loss=0.0476, lr=1]\nSteps: 53%|█████▎ | 798/1500 [08:37<07:33, 1.55it/s, loss=0.0476, lr=1]\nSteps: 53%|█████▎ | 798/1500 [08:37<07:33, 1.55it/s, loss=0.14, lr=1] \nSteps: 53%|█████▎ | 799/1500 [08:37<07:32, 1.55it/s, loss=0.14, lr=1]\nSteps: 53%|█████▎ | 799/1500 [08:37<07:32, 1.55it/s, loss=0.114, lr=1]\nSteps: 53%|█████▎ | 800/1500 [08:38<07:31, 1.55it/s, loss=0.114, lr=1]\nSteps: 53%|█████▎ | 800/1500 [08:38<07:31, 1.55it/s, loss=0.158, lr=1]\nSteps: 53%|█████▎ | 801/1500 [08:39<07:33, 1.54it/s, loss=0.158, lr=1]\nSteps: 53%|█████▎ | 801/1500 [08:39<07:33, 1.54it/s, loss=0.16, lr=1] \nSteps: 53%|█████▎ | 802/1500 [08:39<07:31, 1.54it/s, loss=0.16, lr=1]\nSteps: 53%|█████▎ | 802/1500 [08:39<07:31, 1.54it/s, loss=0.108, lr=1]\nSteps: 54%|█████▎ | 
803/1500 [08:40<07:30, 1.55it/s, loss=0.108, lr=1]\nSteps: 54%|█████▎ | 803/1500 [08:40<07:30, 1.55it/s, loss=0.0939, lr=1]\nSteps: 54%|█████▎ | 804/1500 [08:41<07:29, 1.55it/s, loss=0.0939, lr=1]\nSteps: 54%|█████▎ | 804/1500 [08:41<07:29, 1.55it/s, loss=0.0802, lr=1]\nSteps: 54%|█████▎ | 805/1500 [08:41<07:27, 1.55it/s, loss=0.0802, lr=1]\nSteps: 54%|█████▎ | 805/1500 [08:41<07:27, 1.55it/s, loss=0.0889, lr=1]\nSteps: 54%|█████▎ | 806/1500 [08:42<07:27, 1.55it/s, loss=0.0889, lr=1]\nSteps: 54%|█████▎ | 806/1500 [08:42<07:27, 1.55it/s, loss=0.0765, lr=1]\nSteps: 54%|█████▍ | 807/1500 [08:42<07:26, 1.55it/s, loss=0.0765, lr=1]\nSteps: 54%|█████▍ | 807/1500 [08:42<07:26, 1.55it/s, loss=0.083, lr=1] \nSteps: 54%|█████▍ | 808/1500 [08:43<07:25, 1.55it/s, loss=0.083, lr=1]\nSteps: 54%|█████▍ | 808/1500 [08:43<07:25, 1.55it/s, loss=0.151, lr=1]\nSteps: 54%|█████▍ | 809/1500 [08:44<07:25, 1.55it/s, loss=0.151, lr=1]\nSteps: 54%|█████▍ | 809/1500 [08:44<07:25, 1.55it/s, loss=0.0682, lr=1]\nSteps: 54%|█████▍ | 810/1500 [08:44<07:24, 1.55it/s, loss=0.0682, lr=1]\nSteps: 54%|█████▍ | 810/1500 [08:44<07:24, 1.55it/s, loss=0.0776, lr=1]\nSteps: 54%|█████▍ | 811/1500 [08:45<07:23, 1.55it/s, loss=0.0776, lr=1]\nSteps: 54%|█████▍ | 811/1500 [08:45<07:23, 1.55it/s, loss=0.166, lr=1] \nSteps: 54%|█████▍ | 812/1500 [08:46<07:23, 1.55it/s, loss=0.166, lr=1]\nSteps: 54%|█████▍ | 812/1500 [08:46<07:23, 1.55it/s, loss=0.142, lr=1]\nSteps: 54%|█████▍ | 813/1500 [08:46<07:22, 1.55it/s, loss=0.142, lr=1]\nSteps: 54%|█████▍ | 813/1500 [08:46<07:22, 1.55it/s, loss=0.12, lr=1] \nSteps: 54%|█████▍ | 814/1500 [08:47<07:22, 1.55it/s, loss=0.12, lr=1]\nSteps: 54%|█████▍ | 814/1500 [08:47<07:22, 1.55it/s, loss=0.119, lr=1]\nSteps: 54%|█████▍ | 815/1500 [08:48<07:21, 1.55it/s, loss=0.119, lr=1]\nSteps: 54%|█████▍ | 815/1500 [08:48<07:21, 1.55it/s, loss=0.122, lr=1]\nSteps: 54%|█████▍ | 816/1500 [08:48<07:20, 1.55it/s, loss=0.122, lr=1]\nSteps: 54%|█████▍ | 816/1500 [08:48<07:20, 1.55it/s, 
loss=0.23, lr=1] \nSteps: 54%|█████▍ | 817/1500 [08:49<07:22, 1.54it/s, loss=0.23, lr=1]\nSteps: 54%|█████▍ | 817/1500 [08:49<07:22, 1.54it/s, loss=0.162, lr=1]\nSteps: 55%|█████▍ | 818/1500 [08:50<07:20, 1.55it/s, loss=0.162, lr=1]\nSteps: 55%|█████▍ | 818/1500 [08:50<07:20, 1.55it/s, loss=0.159, lr=1]\nSteps: 55%|█████▍ | 819/1500 [08:50<07:19, 1.55it/s, loss=0.159, lr=1]\nSteps: 55%|█████▍ | 819/1500 [08:50<07:19, 1.55it/s, loss=0.0788, lr=1]\nSteps: 55%|█████▍ | 820/1500 [08:51<07:18, 1.55it/s, loss=0.0788, lr=1]\nSteps: 55%|█████▍ | 820/1500 [08:51<07:18, 1.55it/s, loss=0.192, lr=1] \nSteps: 55%|█████▍ | 821/1500 [08:51<07:17, 1.55it/s, loss=0.192, lr=1]\nSteps: 55%|█████▍ | 821/1500 [08:51<07:17, 1.55it/s, loss=0.244, lr=1]\nSteps: 55%|█████▍ | 822/1500 [08:52<07:16, 1.55it/s, loss=0.244, lr=1]\nSteps: 55%|█████▍ | 822/1500 [08:52<07:16, 1.55it/s, loss=0.0713, lr=1]\nSteps: 55%|█████▍ | 823/1500 [08:53<07:16, 1.55it/s, loss=0.0713, lr=1]\nSteps: 55%|█████▍ | 823/1500 [08:53<07:16, 1.55it/s, loss=0.117, lr=1] \nSteps: 55%|█████▍ | 824/1500 [08:53<07:16, 1.55it/s, loss=0.117, lr=1]\nSteps: 55%|█████▍ | 824/1500 [08:53<07:16, 1.55it/s, loss=0.151, lr=1]\nSteps: 55%|█████▌ | 825/1500 [08:54<07:15, 1.55it/s, loss=0.151, lr=1]\nSteps: 55%|█████▌ | 825/1500 [08:54<07:15, 1.55it/s, loss=0.154, lr=1]\nSteps: 55%|█████▌ | 826/1500 [08:55<07:15, 1.55it/s, loss=0.154, lr=1]\nSteps: 55%|█████▌ | 826/1500 [08:55<07:15, 1.55it/s, loss=0.206, lr=1]\nSteps: 55%|█████▌ | 827/1500 [08:55<07:14, 1.55it/s, loss=0.206, lr=1]\nSteps: 55%|█████▌ | 827/1500 [08:55<07:14, 1.55it/s, loss=0.103, lr=1]\nSteps: 55%|█████▌ | 828/1500 [08:56<07:13, 1.55it/s, loss=0.103, lr=1]\nSteps: 55%|█████▌ | 828/1500 [08:56<07:13, 1.55it/s, loss=0.0578, lr=1]\nSteps: 55%|█████▌ | 829/1500 [08:57<07:13, 1.55it/s, loss=0.0578, lr=1]\nSteps: 55%|█████▌ | 829/1500 [08:57<07:13, 1.55it/s, loss=0.282, lr=1] \nSteps: 55%|█████▌ | 830/1500 [08:57<07:12, 1.55it/s, loss=0.282, lr=1]\nSteps: 55%|█████▌ | 830/1500 
[08:57<07:12, 1.55it/s, loss=0.145, lr=1]\nSteps: 55%|█████▌ | 831/1500 [08:58<07:12, 1.55it/s, loss=0.145, lr=1]\nSteps: 55%|█████▌ | 831/1500 [08:58<07:12, 1.55it/s, loss=0.126, lr=1]\nSteps: 55%|█████▌ | 832/1500 [08:59<07:11, 1.55it/s, loss=0.126, lr=1]\nSteps: 55%|█████▌ | 832/1500 [08:59<07:11, 1.55it/s, loss=0.154, lr=1]\nSteps: 56%|█████▌ | 833/1500 [08:59<07:13, 1.54it/s, loss=0.154, lr=1]\nSteps: 56%|█████▌ | 833/1500 [08:59<07:13, 1.54it/s, loss=0.12, lr=1] \nSteps: 56%|█████▌ | 834/1500 [09:00<07:11, 1.54it/s, loss=0.12, lr=1]\nSteps: 56%|█████▌ | 834/1500 [09:00<07:11, 1.54it/s, loss=0.11, lr=1]\nSteps: 56%|█████▌ | 835/1500 [09:01<07:10, 1.54it/s, loss=0.11, lr=1]\nSteps: 56%|█████▌ | 835/1500 [09:01<07:10, 1.54it/s, loss=0.104, lr=1]\nSteps: 56%|█████▌ | 836/1500 [09:01<07:09, 1.55it/s, loss=0.104, lr=1]\nSteps: 56%|█████▌ | 836/1500 [09:01<07:09, 1.55it/s, loss=0.151, lr=1]\nSteps: 56%|█████▌ | 837/1500 [09:02<07:08, 1.55it/s, loss=0.151, lr=1]\nSteps: 56%|█████▌ | 837/1500 [09:02<07:08, 1.55it/s, loss=0.196, lr=1]\nSteps: 56%|█████▌ | 838/1500 [09:02<07:07, 1.55it/s, loss=0.196, lr=1]\nSteps: 56%|█████▌ | 838/1500 [09:02<07:07, 1.55it/s, loss=0.0562, lr=1]\nSteps: 56%|█████▌ | 839/1500 [09:03<07:07, 1.55it/s, loss=0.0562, lr=1]\nSteps: 56%|█████▌ | 839/1500 [09:03<07:07, 1.55it/s, loss=0.137, lr=1] \nSteps: 56%|█████▌ | 840/1500 [09:04<07:07, 1.54it/s, loss=0.137, lr=1]\nSteps: 56%|█████▌ | 840/1500 [09:04<07:07, 1.54it/s, loss=0.0799, lr=1]\nSteps: 56%|█████▌ | 841/1500 [09:04<07:06, 1.55it/s, loss=0.0799, lr=1]\nSteps: 56%|█████▌ | 841/1500 [09:04<07:06, 1.55it/s, loss=0.0489, lr=1]\nSteps: 56%|█████▌ | 842/1500 [09:05<07:05, 1.55it/s, loss=0.0489, lr=1]\nSteps: 56%|█████▌ | 842/1500 [09:05<07:05, 1.55it/s, loss=0.0304, lr=1]\nSteps: 56%|█████▌ | 843/1500 [09:06<07:04, 1.55it/s, loss=0.0304, lr=1]\nSteps: 56%|█████▌ | 843/1500 [09:06<07:04, 1.55it/s, loss=0.119, lr=1] \nSteps: 56%|█████▋ | 844/1500 [09:06<07:04, 1.55it/s, loss=0.119, 
lr=1]\nSteps: 56%|█████▋ | 844/1500 [09:06<07:04, 1.55it/s, loss=0.102, lr=1]\nSteps: 56%|█████▋ | 845/1500 [09:07<07:03, 1.55it/s, loss=0.102, lr=1]\nSteps: 56%|█████▋ | 845/1500 [09:07<07:03, 1.55it/s, loss=0.163, lr=1]\nSteps: 56%|█████▋ | 846/1500 [09:08<07:02, 1.55it/s, loss=0.163, lr=1]\nSteps: 56%|█████▋ | 846/1500 [09:08<07:02, 1.55it/s, loss=0.111, lr=1]\nSteps: 56%|█████▋ | 847/1500 [09:08<07:02, 1.55it/s, loss=0.111, lr=1]\nSteps: 56%|█████▋ | 847/1500 [09:08<07:02, 1.55it/s, loss=0.0447, lr=1]\nSteps: 57%|█████▋ | 848/1500 [09:09<07:01, 1.55it/s, loss=0.0447, lr=1]\nSteps: 57%|█████▋ | 848/1500 [09:09<07:01, 1.55it/s, loss=0.0748, lr=1]\nSteps: 57%|█████▋ | 849/1500 [09:10<07:03, 1.54it/s, loss=0.0748, lr=1]\nSteps: 57%|█████▋ | 849/1500 [09:10<07:03, 1.54it/s, loss=0.164, lr=1] \nSteps: 57%|█████▋ | 850/1500 [09:10<07:02, 1.54it/s, loss=0.164, lr=1]\nSteps: 57%|█████▋ | 850/1500 [09:10<07:02, 1.54it/s, loss=0.119, lr=1]\nSteps: 57%|█████▋ | 851/1500 [09:11<07:01, 1.54it/s, loss=0.119, lr=1]\nSteps: 57%|█████▋ | 851/1500 [09:11<07:01, 1.54it/s, loss=0.0891, lr=1]\nSteps: 57%|█████▋ | 852/1500 [09:12<06:59, 1.54it/s, loss=0.0891, lr=1]\nSteps: 57%|█████▋ | 852/1500 [09:12<06:59, 1.54it/s, loss=0.134, lr=1] \nSteps: 57%|█████▋ | 853/1500 [09:12<06:58, 1.54it/s, loss=0.134, lr=1]\nSteps: 57%|█████▋ | 853/1500 [09:12<06:58, 1.54it/s, loss=0.131, lr=1]\nSteps: 57%|█████▋ | 854/1500 [09:13<06:57, 1.55it/s, loss=0.131, lr=1]\nSteps: 57%|█████▋ | 854/1500 [09:13<06:57, 1.55it/s, loss=0.019, lr=1]\nSteps: 57%|█████▋ | 855/1500 [09:13<06:56, 1.55it/s, loss=0.019, lr=1]\nSteps: 57%|█████▋ | 855/1500 [09:13<06:56, 1.55it/s, loss=0.0788, lr=1]\nSteps: 57%|█████▋ | 856/1500 [09:14<06:55, 1.55it/s, loss=0.0788, lr=1]\nSteps: 57%|█████▋ | 856/1500 [09:14<06:55, 1.55it/s, loss=0.0444, lr=1]\nSteps: 57%|█████▋ | 857/1500 [09:15<06:54, 1.55it/s, loss=0.0444, lr=1]\nSteps: 57%|█████▋ | 857/1500 [09:15<06:54, 1.55it/s, loss=0.307, lr=1] \nSteps: 57%|█████▋ | 858/1500 
[09:15<06:53, 1.55it/s, loss=0.307, lr=1]\nSteps: 57%|█████▋ | 858/1500 [09:15<06:53, 1.55it/s, loss=0.106, lr=1]\nSteps: 57%|█████▋ | 859/1500 [09:16<06:52, 1.55it/s, loss=0.106, lr=1]\nSteps: 57%|█████▋ | 859/1500 [09:16<06:52, 1.55it/s, loss=0.231, lr=1]\nSteps: 57%|█████▋ | 860/1500 [09:17<06:52, 1.55it/s, loss=0.231, lr=1]\nSteps: 57%|█████▋ | 860/1500 [09:17<06:52, 1.55it/s, loss=0.0339, lr=1]\nSteps: 57%|█████▋ | 861/1500 [09:17<06:51, 1.55it/s, loss=0.0339, lr=1]\nSteps: 57%|█████▋ | 861/1500 [09:17<06:51, 1.55it/s, loss=0.143, lr=1] \nSteps: 57%|█████▋ | 862/1500 [09:18<06:51, 1.55it/s, loss=0.143, lr=1]\nSteps: 57%|█████▋ | 862/1500 [09:18<06:51, 1.55it/s, loss=0.102, lr=1]\nSteps: 58%|█████▊ | 863/1500 [09:19<06:51, 1.55it/s, loss=0.102, lr=1]\nSteps: 58%|█████▊ | 863/1500 [09:19<06:51, 1.55it/s, loss=0.0576, lr=1]\nSteps: 58%|█████▊ | 864/1500 [09:19<06:50, 1.55it/s, loss=0.0576, lr=1]\nSteps: 58%|█████▊ | 864/1500 [09:19<06:50, 1.55it/s, loss=0.209, lr=1] \nSteps: 58%|█████▊ | 865/1500 [09:20<06:51, 1.54it/s, loss=0.209, lr=1]\nSteps: 58%|█████▊ | 865/1500 [09:20<06:51, 1.54it/s, loss=0.0574, lr=1]\nSteps: 58%|█████▊ | 866/1500 [09:21<06:50, 1.55it/s, loss=0.0574, lr=1]\nSteps: 58%|█████▊ | 866/1500 [09:21<06:50, 1.55it/s, loss=0.198, lr=1] \nSteps: 58%|█████▊ | 867/1500 [09:21<06:48, 1.55it/s, loss=0.198, lr=1]\nSteps: 58%|█████▊ | 867/1500 [09:21<06:48, 1.55it/s, loss=0.0527, lr=1]\nSteps: 58%|█████▊ | 868/1500 [09:22<06:47, 1.55it/s, loss=0.0527, lr=1]\nSteps: 58%|█████▊ | 868/1500 [09:22<06:47, 1.55it/s, loss=0.0919, lr=1]\nSteps: 58%|█████▊ | 869/1500 [09:22<06:46, 1.55it/s, loss=0.0919, lr=1]\nSteps: 58%|█████▊ | 869/1500 [09:22<06:46, 1.55it/s, loss=0.137, lr=1] \nSteps: 58%|█████▊ | 870/1500 [09:23<06:46, 1.55it/s, loss=0.137, lr=1]\nSteps: 58%|█████▊ | 870/1500 [09:23<06:46, 1.55it/s, loss=0.404, lr=1]\nSteps: 58%|█████▊ | 871/1500 [09:24<06:45, 1.55it/s, loss=0.404, lr=1]\nSteps: 58%|█████▊ | 871/1500 [09:24<06:45, 1.55it/s, loss=0.0612, 
lr=1]\nSteps: 58%|█████▊ | 872/1500 [09:24<06:45, 1.55it/s, loss=0.0612, lr=1]\nSteps: 58%|█████▊ | 872/1500 [09:24<06:45, 1.55it/s, loss=0.104, lr=1] \nSteps: 58%|█████▊ | 873/1500 [09:25<06:44, 1.55it/s, loss=0.104, lr=1]\nSteps: 58%|█████▊ | 873/1500 [09:25<06:44, 1.55it/s, loss=0.182, lr=1]\nSteps: 58%|█████▊ | 874/1500 [09:26<06:44, 1.55it/s, loss=0.182, lr=1]\nSteps: 58%|█████▊ | 874/1500 [09:26<06:44, 1.55it/s, loss=0.103, lr=1]\nSteps: 58%|█████▊ | 875/1500 [09:26<06:43, 1.55it/s, loss=0.103, lr=1]\nSteps: 58%|█████▊ | 875/1500 [09:26<06:43, 1.55it/s, loss=0.134, lr=1]\nSteps: 58%|█████▊ | 876/1500 [09:27<06:42, 1.55it/s, loss=0.134, lr=1]\nSteps: 58%|█████▊ | 876/1500 [09:27<06:42, 1.55it/s, loss=0.0465, lr=1]\nSteps: 58%|█████▊ | 877/1500 [09:28<06:42, 1.55it/s, loss=0.0465, lr=1]\nSteps: 58%|█████▊ | 877/1500 [09:28<06:42, 1.55it/s, loss=0.175, lr=1] \nSteps: 59%|█████▊ | 878/1500 [09:28<06:41, 1.55it/s, loss=0.175, lr=1]\nSteps: 59%|█████▊ | 878/1500 [09:28<06:41, 1.55it/s, loss=0.169, lr=1]\nSteps: 59%|█████▊ | 879/1500 [09:29<06:40, 1.55it/s, loss=0.169, lr=1]\nSteps: 59%|█████▊ | 879/1500 [09:29<06:40, 1.55it/s, loss=0.082, lr=1]\nSteps: 59%|█████▊ | 880/1500 [09:30<06:40, 1.55it/s, loss=0.082, lr=1]\nSteps: 59%|█████▊ | 880/1500 [09:30<06:40, 1.55it/s, loss=0.101, lr=1]\nSteps: 59%|█████▊ | 881/1500 [09:30<06:41, 1.54it/s, loss=0.101, lr=1]\nSteps: 59%|█████▊ | 881/1500 [09:30<06:41, 1.54it/s, loss=0.0826, lr=1]\nSteps: 59%|█████▉ | 882/1500 [09:31<06:41, 1.54it/s, loss=0.0826, lr=1]\nSteps: 59%|█████▉ | 882/1500 [09:31<06:41, 1.54it/s, loss=0.189, lr=1] \nSteps: 59%|█████▉ | 883/1500 [09:32<06:39, 1.54it/s, loss=0.189, lr=1]\nSteps: 59%|█████▉ | 883/1500 [09:32<06:39, 1.54it/s, loss=0.0935, lr=1]\nSteps: 59%|█████▉ | 884/1500 [09:32<06:38, 1.55it/s, loss=0.0935, lr=1]\nSteps: 59%|█████▉ | 884/1500 [09:32<06:38, 1.55it/s, loss=0.173, lr=1] \nSteps: 59%|█████▉ | 885/1500 [09:33<06:37, 1.55it/s, loss=0.173, lr=1]\nSteps: 59%|█████▉ | 885/1500 
[09:33<06:37, 1.55it/s, loss=0.262, lr=1]\nSteps: 59%|█████▉ | 886/1500 [09:33<06:36, 1.55it/s, loss=0.262, lr=1]\nSteps: 59%|█████▉ | 886/1500 [09:33<06:36, 1.55it/s, loss=0.272, lr=1]\nSteps: 59%|█████▉ | 887/1500 [09:34<06:35, 1.55it/s, loss=0.272, lr=1]\nSteps: 59%|█████▉ | 887/1500 [09:34<06:35, 1.55it/s, loss=0.134, lr=1]\nSteps: 59%|█████▉ | 888/1500 [09:35<06:35, 1.55it/s, loss=0.134, lr=1]\nSteps: 59%|█████▉ | 888/1500 [09:35<06:35, 1.55it/s, loss=0.071, lr=1]\nSteps: 59%|█████▉ | 889/1500 [09:35<06:34, 1.55it/s, loss=0.071, lr=1]\nSteps: 59%|█████▉ | 889/1500 [09:35<06:34, 1.55it/s, loss=0.105, lr=1]\nSteps: 59%|█████▉ | 890/1500 [09:36<06:32, 1.55it/s, loss=0.105, lr=1]\nSteps: 59%|█████▉ | 890/1500 [09:36<06:32, 1.55it/s, loss=0.0701, lr=1]\nSteps: 59%|█████▉ | 891/1500 [09:37<06:33, 1.55it/s, loss=0.0701, lr=1]\nSteps: 59%|█████▉ | 891/1500 [09:37<06:33, 1.55it/s, loss=0.0797, lr=1]\nSteps: 59%|█████▉ | 892/1500 [09:37<06:32, 1.55it/s, loss=0.0797, lr=1]\nSteps: 59%|█████▉ | 892/1500 [09:37<06:32, 1.55it/s, loss=0.132, lr=1] \nSteps: 60%|█████▉ | 893/1500 [09:38<06:31, 1.55it/s, loss=0.132, lr=1]\nSteps: 60%|█████▉ | 893/1500 [09:38<06:31, 1.55it/s, loss=0.0866, lr=1]\nSteps: 60%|█████▉ | 894/1500 [09:39<06:30, 1.55it/s, loss=0.0866, lr=1]\nSteps: 60%|█████▉ | 894/1500 [09:39<06:30, 1.55it/s, loss=0.0557, lr=1]\nSteps: 60%|█████▉ | 895/1500 [09:39<06:30, 1.55it/s, loss=0.0557, lr=1]\nSteps: 60%|█████▉ | 895/1500 [09:39<06:30, 1.55it/s, loss=0.143, lr=1] \nSteps: 60%|█████▉ | 896/1500 [09:40<06:29, 1.55it/s, loss=0.143, lr=1]\nSteps: 60%|█████▉ | 896/1500 [09:40<06:29, 1.55it/s, loss=0.104, lr=1]\nSteps: 60%|█████▉ | 897/1500 [09:41<06:30, 1.54it/s, loss=0.104, lr=1]\nSteps: 60%|█████▉ | 897/1500 [09:41<06:30, 1.54it/s, loss=0.0695, lr=1]\nSteps: 60%|█████▉ | 898/1500 [09:41<06:29, 1.55it/s, loss=0.0695, lr=1]\nSteps: 60%|█████▉ | 898/1500 [09:41<06:29, 1.55it/s, loss=0.134, lr=1] \nSteps: 60%|█████▉ | 899/1500 [09:42<06:28, 1.55it/s, loss=0.134, 
lr=1]\nSteps: 60%|█████▉ | 899/1500 [09:42<06:28, 1.55it/s, loss=0.0307, lr=1]\nSteps: 60%|██████ | 900/1500 [09:43<06:28, 1.55it/s, loss=0.0307, lr=1]\nSteps: 60%|██████ | 900/1500 [09:43<06:28, 1.55it/s, loss=0.0581, lr=1]\nSteps: 60%|██████ | 901/1500 [09:43<06:27, 1.55it/s, loss=0.0581, lr=1]\nSteps: 60%|██████ | 901/1500 [09:43<06:27, 1.55it/s, loss=0.193, lr=1] \nSteps: 60%|██████ | 902/1500 [09:44<06:27, 1.54it/s, loss=0.193, lr=1]\nSteps: 60%|██████ | 902/1500 [09:44<06:27, 1.54it/s, loss=0.101, lr=1]\nSteps: 60%|██████ | 903/1500 [09:44<06:26, 1.55it/s, loss=0.101, lr=1]\nSteps: 60%|██████ | 903/1500 [09:44<06:26, 1.55it/s, loss=0.152, lr=1]\nSteps: 60%|██████ | 904/1500 [09:45<06:25, 1.55it/s, loss=0.152, lr=1]\nSteps: 60%|██████ | 904/1500 [09:45<06:25, 1.55it/s, loss=0.0389, lr=1]\nSteps: 60%|██████ | 905/1500 [09:46<06:24, 1.55it/s, loss=0.0389, lr=1]\nSteps: 60%|██████ | 905/1500 [09:46<06:24, 1.55it/s, loss=0.17, lr=1] \nSteps: 60%|██████ | 906/1500 [09:46<06:23, 1.55it/s, loss=0.17, lr=1]\nSteps: 60%|██████ | 906/1500 [09:46<06:23, 1.55it/s, loss=0.135, lr=1]\nSteps: 60%|██████ | 907/1500 [09:47<06:22, 1.55it/s, loss=0.135, lr=1]\nSteps: 60%|██████ | 907/1500 [09:47<06:22, 1.55it/s, loss=0.0728, lr=1]\nSteps: 61%|██████ | 908/1500 [09:48<06:21, 1.55it/s, loss=0.0728, lr=1]\nSteps: 61%|██████ | 908/1500 [09:48<06:21, 1.55it/s, loss=0.146, lr=1] \nSteps: 61%|██████ | 909/1500 [09:48<06:21, 1.55it/s, loss=0.146, lr=1]\nSteps: 61%|██████ | 909/1500 [09:48<06:21, 1.55it/s, loss=0.113, lr=1]\nSteps: 61%|██████ | 910/1500 [09:49<06:20, 1.55it/s, loss=0.113, lr=1]\nSteps: 61%|██████ | 910/1500 [09:49<06:20, 1.55it/s, loss=0.0995, lr=1]\nSteps: 61%|██████ | 911/1500 [09:50<06:20, 1.55it/s, loss=0.0995, lr=1]\nSteps: 61%|██████ | 911/1500 [09:50<06:20, 1.55it/s, loss=0.115, lr=1] \nSteps: 61%|██████ | 912/1500 [09:50<06:19, 1.55it/s, loss=0.115, lr=1]\nSteps: 61%|██████ | 912/1500 [09:50<06:19, 1.55it/s, loss=0.0825, lr=1]\nSteps: 61%|██████ | 913/1500 
[09:51<06:21, 1.54it/s, loss=0.0825, lr=1]\nSteps: 61%|██████ | 913/1500 [09:51<06:21, 1.54it/s, loss=0.102, lr=1] \nSteps: 61%|██████ | 914/1500 [09:52<06:20, 1.54it/s, loss=0.102, lr=1]\nSteps: 61%|██████ | 914/1500 [09:52<06:20, 1.54it/s, loss=0.182, lr=1]\nSteps: 61%|██████ | 915/1500 [09:52<06:19, 1.54it/s, loss=0.182, lr=1]\nSteps: 61%|██████ | 915/1500 [09:52<06:19, 1.54it/s, loss=0.0937, lr=1]\nSteps: 61%|██████ | 916/1500 [09:53<06:18, 1.54it/s, loss=0.0937, lr=1]\nSteps: 61%|██████ | 916/1500 [09:53<06:18, 1.54it/s, loss=0.159, lr=1] \nSteps: 61%|██████ | 917/1500 [09:54<06:17, 1.54it/s, loss=0.159, lr=1]\nSteps: 61%|██████ | 917/1500 [09:54<06:17, 1.54it/s, loss=0.0698, lr=1]\nSteps: 61%|██████ | 918/1500 [09:54<06:17, 1.54it/s, loss=0.0698, lr=1]\nSteps: 61%|██████ | 918/1500 [09:54<06:17, 1.54it/s, loss=0.195, lr=1] \nSteps: 61%|██████▏ | 919/1500 [09:55<06:16, 1.54it/s, loss=0.195, lr=1]\nSteps: 61%|██████▏ | 919/1500 [09:55<06:16, 1.54it/s, loss=0.0995, lr=1]\nSteps: 61%|██████▏ | 920/1500 [09:55<06:15, 1.54it/s, loss=0.0995, lr=1]\nSteps: 61%|██████▏ | 920/1500 [09:55<06:15, 1.54it/s, loss=0.171, lr=1] \nSteps: 61%|██████▏ | 921/1500 [09:56<06:14, 1.54it/s, loss=0.171, lr=1]\nSteps: 61%|██████▏ | 921/1500 [09:56<06:14, 1.54it/s, loss=0.171, lr=1]\nSteps: 61%|██████▏ | 922/1500 [09:57<06:14, 1.54it/s, loss=0.171, lr=1]\nSteps: 61%|██████▏ | 922/1500 [09:57<06:14, 1.54it/s, loss=0.147, lr=1]\nSteps: 62%|██████▏ | 923/1500 [09:57<06:13, 1.55it/s, loss=0.147, lr=1]\nSteps: 62%|██████▏ | 923/1500 [09:57<06:13, 1.55it/s, loss=0.234, lr=1]\nSteps: 62%|██████▏ | 924/1500 [09:58<06:12, 1.55it/s, loss=0.234, lr=1]\nSteps: 62%|██████▏ | 924/1500 [09:58<06:12, 1.55it/s, loss=0.106, lr=1]\nSteps: 62%|██████▏ | 925/1500 [09:59<06:11, 1.55it/s, loss=0.106, lr=1]\nSteps: 62%|██████▏ | 925/1500 [09:59<06:11, 1.55it/s, loss=0.122, lr=1]\nSteps: 62%|██████▏ | 926/1500 [09:59<06:10, 1.55it/s, loss=0.122, lr=1]\nSteps: 62%|██████▏ | 926/1500 [09:59<06:10, 1.55it/s, 
loss=0.134, lr=1]\nSteps: 62%|██████▏ | 927/1500 [10:00<06:10, 1.55it/s, loss=0.134, lr=1]\nSteps: 62%|██████▏ | 927/1500 [10:00<06:10, 1.55it/s, loss=0.126, lr=1]\nSteps: 62%|██████▏ | 928/1500 [10:01<06:09, 1.55it/s, loss=0.126, lr=1]\nSteps: 62%|██████▏ | 928/1500 [10:01<06:09, 1.55it/s, loss=0.0785, lr=1]\nSteps: 62%|██████▏ | 929/1500 [10:01<06:11, 1.54it/s, loss=0.0785, lr=1]\nSteps: 62%|██████▏ | 929/1500 [10:01<06:11, 1.54it/s, loss=0.142, lr=1] \nSteps: 62%|██████▏ | 930/1500 [10:02<06:10, 1.54it/s, loss=0.142, lr=1]\nSteps: 62%|██████▏ | 930/1500 [10:02<06:10, 1.54it/s, loss=0.149, lr=1]\nSteps: 62%|██████▏ | 931/1500 [10:03<06:08, 1.54it/s, loss=0.149, lr=1]\nSteps: 62%|██████▏ | 931/1500 [10:03<06:08, 1.54it/s, loss=0.157, lr=1]\nSteps: 62%|██████▏ | 932/1500 [10:03<06:08, 1.54it/s, loss=0.157, lr=1]\nSteps: 62%|██████▏ | 932/1500 [10:03<06:08, 1.54it/s, loss=0.173, lr=1]\nSteps: 62%|██████▏ | 933/1500 [10:04<06:07, 1.54it/s, loss=0.173, lr=1]\nSteps: 62%|██████▏ | 933/1500 [10:04<06:07, 1.54it/s, loss=0.0993, lr=1]\nSteps: 62%|██████▏ | 934/1500 [10:05<06:06, 1.55it/s, loss=0.0993, lr=1]\nSteps: 62%|██████▏ | 934/1500 [10:05<06:06, 1.55it/s, loss=0.106, lr=1] \nSteps: 62%|██████▏ | 935/1500 [10:05<06:05, 1.55it/s, loss=0.106, lr=1]\nSteps: 62%|██████▏ | 935/1500 [10:05<06:05, 1.55it/s, loss=0.224, lr=1]\nSteps: 62%|██████▏ | 936/1500 [10:06<06:05, 1.54it/s, loss=0.224, lr=1]\nSteps: 62%|██████▏ | 936/1500 [10:06<06:05, 1.54it/s, loss=0.0994, lr=1]\nSteps: 62%|██████▏ | 937/1500 [10:07<06:26, 1.46it/s, loss=0.0994, lr=1]\nSteps: 62%|██████▏ | 937/1500 [10:07<06:26, 1.46it/s, loss=0.118, lr=1] \nSteps: 63%|██████▎ | 938/1500 [10:07<06:19, 1.48it/s, loss=0.118, lr=1]\nSteps: 63%|██████▎ | 938/1500 [10:07<06:19, 1.48it/s, loss=0.133, lr=1]\nSteps: 63%|██████▎ | 939/1500 [10:08<06:14, 1.50it/s, loss=0.133, lr=1]\nSteps: 63%|██████▎ | 939/1500 [10:08<06:14, 1.50it/s, loss=0.308, lr=1]\nSteps: 63%|██████▎ | 940/1500 [10:09<06:09, 1.52it/s, loss=0.308, 
lr=1]\nSteps: 63%|██████▎ | 940/1500 [10:09<06:09, 1.52it/s, loss=0.133, lr=1]\nSteps: 63%|██████▎ | 941/1500 [10:09<06:06, 1.53it/s, loss=0.133, lr=1]\nSteps: 63%|██████▎ | 941/1500 [10:09<06:06, 1.53it/s, loss=0.132, lr=1]\nSteps: 63%|██████▎ | 942/1500 [10:10<06:04, 1.53it/s, loss=0.132, lr=1]\nSteps: 63%|██████▎ | 942/1500 [10:10<06:04, 1.53it/s, loss=0.0924, lr=1]\nSteps: 63%|██████▎ | 943/1500 [10:10<06:02, 1.54it/s, loss=0.0924, lr=1]\nSteps: 63%|██████▎ | 943/1500 [10:10<06:02, 1.54it/s, loss=0.0814, lr=1]\nSteps: 63%|██████▎ | 944/1500 [10:11<06:00, 1.54it/s, loss=0.0814, lr=1]\nSteps: 63%|██████▎ | 944/1500 [10:11<06:00, 1.54it/s, loss=0.0831, lr=1]\nSteps: 63%|██████▎ | 945/1500 [10:12<06:02, 1.53it/s, loss=0.0831, lr=1]\nSteps: 63%|██████▎ | 945/1500 [10:12<06:02, 1.53it/s, loss=0.0933, lr=1]\nSteps: 63%|██████▎ | 946/1500 [10:12<06:00, 1.54it/s, loss=0.0933, lr=1]\nSteps: 63%|██████▎ | 946/1500 [10:12<06:00, 1.54it/s, loss=0.139, lr=1] \nSteps: 63%|██████▎ | 947/1500 [10:13<05:58, 1.54it/s, loss=0.139, lr=1]\nSteps: 63%|██████▎ | 947/1500 [10:13<05:58, 1.54it/s, loss=0.0964, lr=1]\nSteps: 63%|██████▎ | 948/1500 [10:14<05:57, 1.54it/s, loss=0.0964, lr=1]\nSteps: 63%|██████▎ | 948/1500 [10:14<05:57, 1.54it/s, loss=0.0664, lr=1]\nSteps: 63%|██████▎ | 949/1500 [10:14<05:56, 1.54it/s, loss=0.0664, lr=1]\nSteps: 63%|██████▎ | 949/1500 [10:14<05:56, 1.54it/s, loss=0.108, lr=1] \nSteps: 63%|██████▎ | 950/1500 [10:15<05:55, 1.55it/s, loss=0.108, lr=1]\nSteps: 63%|██████▎ | 950/1500 [10:15<05:55, 1.55it/s, loss=0.0834, lr=1]\nSteps: 63%|██████▎ | 951/1500 [10:16<05:54, 1.55it/s, loss=0.0834, lr=1]\nSteps: 63%|██████▎ | 951/1500 [10:16<05:54, 1.55it/s, loss=0.109, lr=1] \nSteps: 63%|██████▎ | 952/1500 [10:16<05:54, 1.54it/s, loss=0.109, lr=1]\nSteps: 63%|██████▎ | 952/1500 [10:16<05:54, 1.54it/s, loss=0.218, lr=1]\nSteps: 64%|██████▎ | 953/1500 [10:17<05:53, 1.55it/s, loss=0.218, lr=1]\nSteps: 64%|██████▎ | 953/1500 [10:17<05:53, 1.55it/s, loss=0.0897, 
lr=1]\nSteps: 64%|██████▎ | 954/1500 [10:18<05:52, 1.55it/s, loss=0.0897, lr=1]\nSteps: 64%|██████▎ | 954/1500 [10:18<05:52, 1.55it/s, loss=0.0709, lr=1]\nSteps: 64%|██████▎ | 955/1500 [10:18<05:52, 1.55it/s, loss=0.0709, lr=1]\nSteps: 64%|██████▎ | 955/1500 [10:18<05:52, 1.55it/s, loss=0.223, lr=1] \nSteps: 64%|██████▎ | 956/1500 [10:19<05:51, 1.55it/s, loss=0.223, lr=1]\nSteps: 64%|██████▎ | 956/1500 [10:19<05:51, 1.55it/s, loss=0.14, lr=1] \nSteps: 64%|██████▍ | 957/1500 [10:20<05:50, 1.55it/s, loss=0.14, lr=1]\nSteps: 64%|██████▍ | 957/1500 [10:20<05:50, 1.55it/s, loss=0.14, lr=1]\nSteps: 64%|██████▍ | 958/1500 [10:20<05:49, 1.55it/s, loss=0.14, lr=1]\nSteps: 64%|██████▍ | 958/1500 [10:20<05:49, 1.55it/s, loss=0.118, lr=1]\nSteps: 64%|██████▍ | 959/1500 [10:21<05:49, 1.55it/s, loss=0.118, lr=1]\nSteps: 64%|██████▍ | 959/1500 [10:21<05:49, 1.55it/s, loss=0.136, lr=1]\nSteps: 64%|██████▍ | 960/1500 [10:21<05:48, 1.55it/s, loss=0.136, lr=1]\nSteps: 64%|██████▍ | 960/1500 [10:21<05:48, 1.55it/s, loss=0.0722, lr=1]\nSteps: 64%|██████▍ | 961/1500 [10:22<05:49, 1.54it/s, loss=0.0722, lr=1]\nSteps: 64%|██████▍ | 961/1500 [10:22<05:49, 1.54it/s, loss=0.0631, lr=1]\nSteps: 64%|██████▍ | 962/1500 [10:23<05:48, 1.54it/s, loss=0.0631, lr=1]\nSteps: 64%|██████▍ | 962/1500 [10:23<05:48, 1.54it/s, loss=0.118, lr=1] \nSteps: 64%|██████▍ | 963/1500 [10:23<05:47, 1.55it/s, loss=0.118, lr=1]\nSteps: 64%|██████▍ | 963/1500 [10:23<05:47, 1.55it/s, loss=0.0361, lr=1]\nSteps: 64%|██████▍ | 964/1500 [10:24<05:46, 1.55it/s, loss=0.0361, lr=1]\nSteps: 64%|██████▍ | 964/1500 [10:24<05:46, 1.55it/s, loss=0.137, lr=1] \nSteps: 64%|██████▍ | 965/1500 [10:25<05:45, 1.55it/s, loss=0.137, lr=1]\nSteps: 64%|██████▍ | 965/1500 [10:25<05:45, 1.55it/s, loss=0.116, lr=1]\nSteps: 64%|██████▍ | 966/1500 [10:25<05:44, 1.55it/s, loss=0.116, lr=1]\nSteps: 64%|██████▍ | 966/1500 [10:25<05:44, 1.55it/s, loss=0.124, lr=1]\nSteps: 64%|██████▍ | 967/1500 [10:26<05:43, 1.55it/s, loss=0.124, lr=1]\nSteps: 
64%|██████▍ | 967/1500 [10:26<05:43, 1.55it/s, loss=0.0765, lr=1]\nSteps: 65%|██████▍ | 968/1500 [10:27<05:43, 1.55it/s, loss=0.0765, lr=1]\nSteps: 65%|██████▍ | 968/1500 [10:27<05:43, 1.55it/s, loss=0.117, lr=1] \nSteps: 65%|██████▍ | 969/1500 [10:27<05:42, 1.55it/s, loss=0.117, lr=1]\nSteps: 65%|██████▍ | 969/1500 [10:27<05:42, 1.55it/s, loss=0.152, lr=1]\nSteps: 65%|██████▍ | 970/1500 [10:28<05:41, 1.55it/s, loss=0.152, lr=1]\nSteps: 65%|██████▍ | 970/1500 [10:28<05:41, 1.55it/s, loss=0.0291, lr=1]\nSteps: 65%|██████▍ | 971/1500 [10:29<05:41, 1.55it/s, loss=0.0291, lr=1]\nSteps: 65%|██████▍ | 971/1500 [10:29<05:41, 1.55it/s, loss=0.268, lr=1] \nSteps: 65%|██████▍ | 972/1500 [10:29<05:41, 1.55it/s, loss=0.268, lr=1]\nSteps: 65%|██████▍ | 972/1500 [10:29<05:41, 1.55it/s, loss=0.134, lr=1]\nSteps: 65%|██████▍ | 973/1500 [10:30<05:40, 1.55it/s, loss=0.134, lr=1]\nSteps: 65%|██████▍ | 973/1500 [10:30<05:40, 1.55it/s, loss=0.145, lr=1]\nSteps: 65%|██████▍ | 974/1500 [10:31<05:39, 1.55it/s, loss=0.145, lr=1]\nSteps: 65%|██████▍ | 974/1500 [10:31<05:39, 1.55it/s, loss=0.071, lr=1]\nSteps: 65%|██████▌ | 975/1500 [10:31<05:38, 1.55it/s, loss=0.071, lr=1]\nSteps: 65%|██████▌ | 975/1500 [10:31<05:38, 1.55it/s, loss=0.0721, lr=1]\nSteps: 65%|██████▌ | 976/1500 [10:32<05:38, 1.55it/s, loss=0.0721, lr=1]\nSteps: 65%|██████▌ | 976/1500 [10:32<05:38, 1.55it/s, loss=0.0855, lr=1]\nSteps: 65%|██████▌ | 977/1500 [10:32<05:39, 1.54it/s, loss=0.0855, lr=1]\nSteps: 65%|██████▌ | 977/1500 [10:32<05:39, 1.54it/s, loss=0.0741, lr=1]\nSteps: 65%|██████▌ | 978/1500 [10:33<05:37, 1.55it/s, loss=0.0741, lr=1]\nSteps: 65%|██████▌ | 978/1500 [10:33<05:37, 1.55it/s, loss=0.109, lr=1] \nSteps: 65%|██████▌ | 979/1500 [10:34<05:36, 1.55it/s, loss=0.109, lr=1]\nSteps: 65%|██████▌ | 979/1500 [10:34<05:36, 1.55it/s, loss=0.0505, lr=1]\nSteps: 65%|██████▌ | 980/1500 [10:34<05:36, 1.55it/s, loss=0.0505, lr=1]\nSteps: 65%|██████▌ | 980/1500 [10:34<05:36, 1.55it/s, loss=0.0594, lr=1]\nSteps: 65%|██████▌ 
| 981/1500 [10:35<05:35, 1.55it/s, loss=0.0594, lr=1]\nSteps: 65%|██████▌ | 981/1500 [10:35<05:35, 1.55it/s, loss=0.21, lr=1] \nSteps: 65%|██████▌ | 982/1500 [10:36<05:34, 1.55it/s, loss=0.21, lr=1]\nSteps: 65%|██████▌ | 982/1500 [10:36<05:34, 1.55it/s, loss=0.184, lr=1]\nSteps: 66%|██████▌ | 983/1500 [10:36<05:33, 1.55it/s, loss=0.184, lr=1]\nSteps: 66%|██████▌ | 983/1500 [10:36<05:33, 1.55it/s, loss=0.152, lr=1]\nSteps: 66%|██████▌ | 984/1500 [10:37<05:32, 1.55it/s, loss=0.152, lr=1]\nSteps: 66%|██████▌ | 984/1500 [10:37<05:32, 1.55it/s, loss=0.0604, lr=1]\nSteps: 66%|██████▌ | 985/1500 [10:38<06:15, 1.37it/s, loss=0.0604, lr=1]\nSteps: 66%|██████▌ | 985/1500 [10:38<06:15, 1.37it/s, loss=0.116, lr=1] \nSteps: 66%|██████▌ | 986/1500 [10:39<06:01, 1.42it/s, loss=0.116, lr=1]\nSteps: 66%|██████▌ | 986/1500 [10:39<06:01, 1.42it/s, loss=0.169, lr=1]\nSteps: 66%|██████▌ | 987/1500 [10:39<05:51, 1.46it/s, loss=0.169, lr=1]\nSteps: 66%|██████▌ | 987/1500 [10:39<05:51, 1.46it/s, loss=0.129, lr=1]\nSteps: 66%|██████▌ | 988/1500 [10:40<05:44, 1.49it/s, loss=0.129, lr=1]\nSteps: 66%|██████▌ | 988/1500 [10:40<05:44, 1.49it/s, loss=0.106, lr=1]\nSteps: 66%|██████▌ | 989/1500 [10:40<05:39, 1.51it/s, loss=0.106, lr=1]\nSteps: 66%|██████▌ | 989/1500 [10:40<05:39, 1.51it/s, loss=0.117, lr=1]\nSteps: 66%|██████▌ | 990/1500 [10:41<05:35, 1.52it/s, loss=0.117, lr=1]\nSteps: 66%|██████▌ | 990/1500 [10:41<05:35, 1.52it/s, loss=0.176, lr=1]\nSteps: 66%|██████▌ | 991/1500 [10:42<05:33, 1.53it/s, loss=0.176, lr=1]\nSteps: 66%|██████▌ | 991/1500 [10:42<05:33, 1.53it/s, loss=0.0918, lr=1]\nSteps: 66%|██████▌ | 992/1500 [10:42<05:30, 1.53it/s, loss=0.0918, lr=1]\nSteps: 66%|██████▌ | 992/1500 [10:42<05:30, 1.53it/s, loss=0.0964, lr=1]\nSteps: 66%|██████▌ | 993/1500 [10:43<05:31, 1.53it/s, loss=0.0964, lr=1]\nSteps: 66%|██████▌ | 993/1500 [10:43<05:31, 1.53it/s, loss=0.116, lr=1] \nSteps: 66%|██████▋ | 994/1500 [10:44<05:31, 1.53it/s, loss=0.116, lr=1]\nSteps: 66%|██████▋ | 994/1500 
[10:44<05:31, 1.53it/s, loss=0.0747, lr=1]\nSteps: 66%|██████▋ | 995/1500 [10:44<05:29, 1.53it/s, loss=0.0747, lr=1]\nSteps: 66%|██████▋ | 995/1500 [10:44<05:29, 1.53it/s, loss=0.207, lr=1] \nSteps: 66%|██████▋ | 996/1500 [10:45<05:27, 1.54it/s, loss=0.207, lr=1]\nSteps: 66%|██████▋ | 996/1500 [10:45<05:27, 1.54it/s, loss=0.0351, lr=1]\nSteps: 66%|██████▋ | 997/1500 [10:46<05:26, 1.54it/s, loss=0.0351, lr=1]\nSteps: 66%|██████▋ | 997/1500 [10:46<05:26, 1.54it/s, loss=0.0997, lr=1]\nSteps: 67%|██████▋ | 998/1500 [10:46<05:25, 1.54it/s, loss=0.0997, lr=1]\nSteps: 67%|██████▋ | 998/1500 [10:46<05:25, 1.54it/s, loss=0.201, lr=1] \nSteps: 67%|██████▋ | 999/1500 [10:47<05:24, 1.54it/s, loss=0.201, lr=1]\nSteps: 67%|██████▋ | 999/1500 [10:47<05:24, 1.54it/s, loss=0.289, lr=1]\nSteps: 67%|██████▋ | 1000/1500 [10:48<05:23, 1.55it/s, loss=0.289, lr=1]\nSteps: 67%|██████▋ | 1000/1500 [10:48<05:23, 1.55it/s, loss=0.0773, lr=1]\nSteps: 67%|██████▋ | 1001/1500 [10:48<05:23, 1.54it/s, loss=0.0773, lr=1]\nSteps: 67%|██████▋ | 1001/1500 [10:48<05:23, 1.54it/s, loss=0.0843, lr=1]\nSteps: 67%|██████▋ | 1002/1500 [10:49<05:22, 1.54it/s, loss=0.0843, lr=1]\nSteps: 67%|██████▋ | 1002/1500 [10:49<05:22, 1.54it/s, loss=0.231, lr=1] \nSteps: 67%|██████▋ | 1003/1500 [10:50<05:21, 1.55it/s, loss=0.231, lr=1]\nSteps: 67%|██████▋ | 1003/1500 [10:50<05:21, 1.55it/s, loss=0.0999, lr=1]\nSteps: 67%|██████▋ | 1004/1500 [10:50<05:20, 1.55it/s, loss=0.0999, lr=1]\nSteps: 67%|██████▋ | 1004/1500 [10:50<05:20, 1.55it/s, loss=0.193, lr=1] \nSteps: 67%|██████▋ | 1005/1500 [10:51<05:19, 1.55it/s, loss=0.193, lr=1]\nSteps: 67%|██████▋ | 1005/1500 [10:51<05:19, 1.55it/s, loss=0.0955, lr=1]\nSteps: 67%|██████▋ | 1006/1500 [10:51<05:18, 1.55it/s, loss=0.0955, lr=1]\nSteps: 67%|██████▋ | 1006/1500 [10:51<05:18, 1.55it/s, loss=0.0954, lr=1]\nSteps: 67%|██████▋ | 1007/1500 [10:52<05:18, 1.55it/s, loss=0.0954, lr=1]\nSteps: 67%|██████▋ | 1007/1500 [10:52<05:18, 1.55it/s, loss=0.273, lr=1] \nSteps: 67%|██████▋ | 
1008/1500 [10:53<05:17, 1.55it/s, loss=0.273, lr=1]\nSteps: 67%|██████▋ | 1008/1500 [10:53<05:17, 1.55it/s, loss=0.136, lr=1]\nSteps: 67%|██████▋ | 1009/1500 [10:53<05:18, 1.54it/s, loss=0.136, lr=1]\nSteps: 67%|██████▋ | 1009/1500 [10:53<05:18, 1.54it/s, loss=0.0223, lr=1]\nSteps: 67%|██████▋ | 1010/1500 [10:54<05:17, 1.54it/s, loss=0.0223, lr=1]\nSteps: 67%|██████▋ | 1010/1500 [10:54<05:17, 1.54it/s, loss=0.085, lr=1] \nSteps: 67%|██████▋ | 1011/1500 [10:55<05:17, 1.54it/s, loss=0.085, lr=1]\nSteps: 67%|██████▋ | 1011/1500 [10:55<05:17, 1.54it/s, loss=0.0704, lr=1]\nSteps: 67%|██████▋ | 1012/1500 [10:55<05:16, 1.54it/s, loss=0.0704, lr=1]\nSteps: 67%|██████▋ | 1012/1500 [10:55<05:16, 1.54it/s, loss=0.0788, lr=1]\nSteps: 68%|██████▊ | 1013/1500 [10:56<05:15, 1.54it/s, loss=0.0788, lr=1]\nSteps: 68%|██████▊ | 1013/1500 [10:56<05:15, 1.54it/s, loss=0.279, lr=1] \nSteps: 68%|██████▊ | 1014/1500 [10:57<05:14, 1.54it/s, loss=0.279, lr=1]\nSteps: 68%|██████▊ | 1014/1500 [10:57<05:14, 1.54it/s, loss=0.0669, lr=1]\nSteps: 68%|██████▊ | 1015/1500 [10:57<05:14, 1.54it/s, loss=0.0669, lr=1]\nSteps: 68%|██████▊ | 1015/1500 [10:57<05:14, 1.54it/s, loss=0.122, lr=1] \nSteps: 68%|██████▊ | 1016/1500 [10:58<05:13, 1.54it/s, loss=0.122, lr=1]\nSteps: 68%|██████▊ | 1016/1500 [10:58<05:13, 1.54it/s, loss=0.0945, lr=1]\nSteps: 68%|██████▊ | 1017/1500 [10:59<05:12, 1.55it/s, loss=0.0945, lr=1]\nSteps: 68%|██████▊ | 1017/1500 [10:59<05:12, 1.55it/s, loss=0.121, lr=1] \nSteps: 68%|██████▊ | 1018/1500 [10:59<05:11, 1.54it/s, loss=0.121, lr=1]\nSteps: 68%|██████▊ | 1018/1500 [10:59<05:11, 1.54it/s, loss=0.121, lr=1]\nSteps: 68%|██████▊ | 1019/1500 [11:00<05:11, 1.54it/s, loss=0.121, lr=1]\nSteps: 68%|██████▊ | 1019/1500 [11:00<05:11, 1.54it/s, loss=0.232, lr=1]\nSteps: 68%|██████▊ | 1020/1500 [11:01<05:10, 1.54it/s, loss=0.232, lr=1]\nSteps: 68%|██████▊ | 1020/1500 [11:01<05:10, 1.54it/s, loss=0.085, lr=1]\nSteps: 68%|██████▊ | 1021/1500 [11:01<05:10, 1.54it/s, loss=0.085, lr=1]\nSteps: 
68%|██████▊ | 1021/1500 [11:01<05:10, 1.54it/s, loss=0.0912, lr=1]\nSteps: 68%|██████▊ | 1022/1500 [11:02<05:10, 1.54it/s, loss=0.0912, lr=1]\nSteps: 68%|██████▊ | 1022/1500 [11:02<05:10, 1.54it/s, loss=0.0651, lr=1]\nSteps: 68%|██████▊ | 1023/1500 [11:02<05:09, 1.54it/s, loss=0.0651, lr=1]\nSteps: 68%|██████▊ | 1023/1500 [11:02<05:09, 1.54it/s, loss=0.0371, lr=1]\nSteps: 68%|██████▊ | 1024/1500 [11:03<05:08, 1.54it/s, loss=0.0371, lr=1]\nSteps: 68%|██████▊ | 1024/1500 [11:03<05:08, 1.54it/s, loss=0.117, lr=1] \nSteps: 68%|██████▊ | 1025/1500 [11:04<05:10, 1.53it/s, loss=0.117, lr=1]\nSteps: 68%|██████▊ | 1025/1500 [11:04<05:10, 1.53it/s, loss=0.107, lr=1]\nSteps: 68%|██████▊ | 1026/1500 [11:04<05:09, 1.53it/s, loss=0.107, lr=1]\nSteps: 68%|██████▊ | 1026/1500 [11:04<05:09, 1.53it/s, loss=0.121, lr=1]\nSteps: 68%|██████▊ | 1027/1500 [11:05<05:07, 1.54it/s, loss=0.121, lr=1]\nSteps: 68%|██████▊ | 1027/1500 [11:05<05:07, 1.54it/s, loss=0.152, lr=1]\nSteps: 69%|██████▊ | 1028/1500 [11:06<05:06, 1.54it/s, loss=0.152, lr=1]\nSteps: 69%|██████▊ | 1028/1500 [11:06<05:06, 1.54it/s, loss=0.248, lr=1]\nSteps: 69%|██████▊ | 1029/1500 [11:06<05:05, 1.54it/s, loss=0.248, lr=1]\nSteps: 69%|██████▊ | 1029/1500 [11:06<05:05, 1.54it/s, loss=0.0565, lr=1]\nSteps: 69%|██████▊ | 1030/1500 [11:07<05:04, 1.54it/s, loss=0.0565, lr=1]\nSteps: 69%|██████▊ | 1030/1500 [11:07<05:04, 1.54it/s, loss=0.0352, lr=1]\nSteps: 69%|██████▊ | 1031/1500 [11:08<05:03, 1.55it/s, loss=0.0352, lr=1]\nSteps: 69%|██████▊ | 1031/1500 [11:08<05:03, 1.55it/s, loss=0.0997, lr=1]\nSteps: 69%|██████▉ | 1032/1500 [11:08<05:02, 1.55it/s, loss=0.0997, lr=1]\nSteps: 69%|██████▉ | 1032/1500 [11:08<05:02, 1.55it/s, loss=0.115, lr=1] \nSteps: 69%|██████▉ | 1033/1500 [11:09<05:01, 1.55it/s, loss=0.115, lr=1]\nSteps: 69%|██████▉ | 1033/1500 [11:09<05:01, 1.55it/s, loss=0.218, lr=1]\nSteps: 69%|██████▉ | 1034/1500 [11:10<05:01, 1.55it/s, loss=0.218, lr=1]\nSteps: 69%|██████▉ | 1034/1500 [11:10<05:01, 1.55it/s, loss=0.0188, 
lr=1]\nSteps: 69%|██████▉ | 1035/1500 [11:10<05:00, 1.55it/s, loss=0.0188, lr=1]\nSteps: 69%|██████▉ | 1035/1500 [11:10<05:00, 1.55it/s, loss=0.139, lr=1] \nSteps: 69%|██████▉ | 1036/1500 [11:11<04:59, 1.55it/s, loss=0.139, lr=1]\nSteps: 69%|██████▉ | 1036/1500 [11:11<04:59, 1.55it/s, loss=0.191, lr=1]\nSteps: 69%|██████▉ | 1037/1500 [11:12<04:59, 1.55it/s, loss=0.191, lr=1]\nSteps: 69%|██████▉ | 1037/1500 [11:12<04:59, 1.55it/s, loss=0.103, lr=1]\nSteps: 69%|██████▉ | 1038/1500 [11:12<04:59, 1.54it/s, loss=0.103, lr=1]\nSteps: 69%|██████▉ | 1038/1500 [11:12<04:59, 1.54it/s, loss=0.1, lr=1] \nSteps: 69%|██████▉ | 1039/1500 [11:13<04:58, 1.54it/s, loss=0.1, lr=1]\nSteps: 69%|██████▉ | 1039/1500 [11:13<04:58, 1.54it/s, loss=0.106, lr=1]\nSteps: 69%|██████▉ | 1040/1500 [11:13<04:57, 1.55it/s, loss=0.106, lr=1]\nSteps: 69%|██████▉ | 1040/1500 [11:14<04:57, 1.55it/s, loss=0.144, lr=1]\nSteps: 69%|██████▉ | 1041/1500 [11:14<04:58, 1.54it/s, loss=0.144, lr=1]\nSteps: 69%|██████▉ | 1041/1500 [11:14<04:58, 1.54it/s, loss=0.218, lr=1]\nSteps: 69%|██████▉ | 1042/1500 [11:15<04:57, 1.54it/s, loss=0.218, lr=1]\nSteps: 69%|██████▉ | 1042/1500 [11:15<04:57, 1.54it/s, loss=0.0844, lr=1]\nSteps: 70%|██████▉ | 1043/1500 [11:15<04:55, 1.54it/s, loss=0.0844, lr=1]\nSteps: 70%|██████▉ | 1043/1500 [11:15<04:55, 1.54it/s, loss=0.13, lr=1] \nSteps: 70%|██████▉ | 1044/1500 [11:16<04:54, 1.55it/s, loss=0.13, lr=1]\nSteps: 70%|██████▉ | 1044/1500 [11:16<04:54, 1.55it/s, loss=0.13, lr=1]\nSteps: 70%|██████▉ | 1045/1500 [11:17<04:54, 1.55it/s, loss=0.13, lr=1]\nSteps: 70%|██████▉ | 1045/1500 [11:17<04:54, 1.55it/s, loss=0.187, lr=1]\nSteps: 70%|██████▉ | 1046/1500 [11:17<04:53, 1.55it/s, loss=0.187, lr=1]\nSteps: 70%|██████▉ | 1046/1500 [11:17<04:53, 1.55it/s, loss=0.149, lr=1]\nSteps: 70%|██████▉ | 1047/1500 [11:18<04:52, 1.55it/s, loss=0.149, lr=1]\nSteps: 70%|██████▉ | 1047/1500 [11:18<04:52, 1.55it/s, loss=0.124, lr=1]\nSteps: 70%|██████▉ | 1048/1500 [11:19<04:51, 1.55it/s, loss=0.124, 
lr=1]\nSteps: 70%|██████▉ | 1048/1500 [11:19<04:51, 1.55it/s, loss=0.0894, lr=1]\nSteps: 70%|██████▉ | 1049/1500 [11:19<04:51, 1.55it/s, loss=0.0894, lr=1]\nSteps: 70%|██████▉ | 1049/1500 [11:19<04:51, 1.55it/s, loss=0.117, lr=1] \nSteps: 70%|███████ | 1050/1500 [11:20<04:50, 1.55it/s, loss=0.117, lr=1]\nSteps: 70%|███████ | 1050/1500 [11:20<04:50, 1.55it/s, loss=0.125, lr=1]\nSteps: 70%|███████ | 1051/1500 [11:21<04:50, 1.55it/s, loss=0.125, lr=1]\nSteps: 70%|███████ | 1051/1500 [11:21<04:50, 1.55it/s, loss=0.0965, lr=1]\nSteps: 70%|███████ | 1052/1500 [11:21<04:49, 1.55it/s, loss=0.0965, lr=1]\nSteps: 70%|███████ | 1052/1500 [11:21<04:49, 1.55it/s, loss=0.0396, lr=1]\nSteps: 70%|███████ | 1053/1500 [11:22<04:49, 1.55it/s, loss=0.0396, lr=1]\nSteps: 70%|███████ | 1053/1500 [11:22<04:49, 1.55it/s, loss=0.102, lr=1] \nSteps: 70%|███████ | 1054/1500 [11:23<04:48, 1.55it/s, loss=0.102, lr=1]\nSteps: 70%|███████ | 1054/1500 [11:23<04:48, 1.55it/s, loss=0.27, lr=1] \nSteps: 70%|███████ | 1055/1500 [11:23<04:47, 1.55it/s, loss=0.27, lr=1]\nSteps: 70%|███████ | 1055/1500 [11:23<04:47, 1.55it/s, loss=0.119, lr=1]\nSteps: 70%|███████ | 1056/1500 [11:24<04:47, 1.54it/s, loss=0.119, lr=1]\nSteps: 70%|███████ | 1056/1500 [11:24<04:47, 1.54it/s, loss=0.154, lr=1]\nSteps: 70%|███████ | 1057/1500 [11:25<04:48, 1.54it/s, loss=0.154, lr=1]\nSteps: 70%|███████ | 1057/1500 [11:25<04:48, 1.54it/s, loss=0.0516, lr=1]\nSteps: 71%|███████ | 1058/1500 [11:25<04:46, 1.54it/s, loss=0.0516, lr=1]\nSteps: 71%|███████ | 1058/1500 [11:25<04:46, 1.54it/s, loss=0.21, lr=1] \nSteps: 71%|███████ | 1059/1500 [11:26<04:45, 1.54it/s, loss=0.21, lr=1]\nSteps: 71%|███████ | 1059/1500 [11:26<04:45, 1.54it/s, loss=0.178, lr=1]\nSteps: 71%|███████ | 1060/1500 [11:26<04:44, 1.55it/s, loss=0.178, lr=1]\nSteps: 71%|███████ | 1060/1500 [11:26<04:44, 1.55it/s, loss=0.118, lr=1]\nSteps: 71%|███████ | 1061/1500 [11:27<04:43, 1.55it/s, loss=0.118, lr=1]\nSteps: 71%|███████ | 1061/1500 [11:27<04:43, 1.55it/s, 
loss=0.274, lr=1]\nSteps: 71%|███████ | 1062/1500 [11:28<04:42, 1.55it/s, loss=0.274, lr=1]\nSteps: 71%|███████ | 1062/1500 [11:28<04:42, 1.55it/s, loss=0.135, lr=1]\nSteps: 71%|███████ | 1063/1500 [11:28<04:42, 1.55it/s, loss=0.135, lr=1]\nSteps: 71%|███████ | 1063/1500 [11:28<04:42, 1.55it/s, loss=0.158, lr=1]\nSteps: 71%|███████ | 1064/1500 [11:29<04:41, 1.55it/s, loss=0.158, lr=1]\nSteps: 71%|███████ | 1064/1500 [11:29<04:41, 1.55it/s, loss=0.175, lr=1]\nSteps: 71%|███████ | 1065/1500 [11:30<04:40, 1.55it/s, loss=0.175, lr=1]\nSteps: 71%|███████ | 1065/1500 [11:30<04:40, 1.55it/s, loss=0.0599, lr=1]\nSteps: 71%|███████ | 1066/1500 [11:30<04:40, 1.55it/s, loss=0.0599, lr=1]\nSteps: 71%|███████ | 1066/1500 [11:30<04:40, 1.55it/s, loss=0.148, lr=1] \nSteps: 71%|███████ | 1067/1500 [11:31<04:39, 1.55it/s, loss=0.148, lr=1]\nSteps: 71%|███████ | 1067/1500 [11:31<04:39, 1.55it/s, loss=0.0743, lr=1]\nSteps: 71%|███████ | 1068/1500 [11:32<04:38, 1.55it/s, loss=0.0743, lr=1]\nSteps: 71%|███████ | 1068/1500 [11:32<04:38, 1.55it/s, loss=0.0792, lr=1]\nSteps: 71%|███████▏ | 1069/1500 [11:32<04:38, 1.55it/s, loss=0.0792, lr=1]\nSteps: 71%|███████▏ | 1069/1500 [11:32<04:38, 1.55it/s, loss=0.0823, lr=1]\nSteps: 71%|███████▏ | 1070/1500 [11:33<04:37, 1.55it/s, loss=0.0823, lr=1]\nSteps: 71%|███████▏ | 1070/1500 [11:33<04:37, 1.55it/s, loss=0.042, lr=1] \nSteps: 71%|███████▏ | 1071/1500 [11:34<04:36, 1.55it/s, loss=0.042, lr=1]\nSteps: 71%|███████▏ | 1071/1500 [11:34<04:36, 1.55it/s, loss=0.0881, lr=1]\nSteps: 71%|███████▏ | 1072/1500 [11:34<04:35, 1.55it/s, loss=0.0881, lr=1]\nSteps: 71%|███████▏ | 1072/1500 [11:34<04:35, 1.55it/s, loss=0.249, lr=1] \nSteps: 72%|███████▏ | 1073/1500 [11:35<04:37, 1.54it/s, loss=0.249, lr=1]\nSteps: 72%|███████▏ | 1073/1500 [11:35<04:37, 1.54it/s, loss=0.176, lr=1]\nSteps: 72%|███████▏ | 1074/1500 [11:35<04:35, 1.54it/s, loss=0.176, lr=1]\nSteps: 72%|███████▏ | 1074/1500 [11:35<04:35, 1.54it/s, loss=0.111, lr=1]\nSteps: 72%|███████▏ | 1075/1500 
[11:36<04:34, 1.55it/s, loss=0.111, lr=1]\nSteps: 72%|███████▏ | 1075/1500 [11:36<04:34, 1.55it/s, loss=0.19, lr=1] \nSteps: 72%|███████▏ | 1076/1500 [11:37<04:34, 1.55it/s, loss=0.19, lr=1]\nSteps: 72%|███████▏ | 1076/1500 [11:37<04:34, 1.55it/s, loss=0.118, lr=1]\nSteps: 72%|███████▏ | 1077/1500 [11:37<04:33, 1.55it/s, loss=0.118, lr=1]\nSteps: 72%|███████▏ | 1077/1500 [11:37<04:33, 1.55it/s, loss=0.113, lr=1]\nSteps: 72%|███████▏ | 1078/1500 [11:38<04:32, 1.55it/s, loss=0.113, lr=1]\nSteps: 72%|███████▏ | 1078/1500 [11:38<04:32, 1.55it/s, loss=0.0998, lr=1]\nSteps: 72%|███████▏ | 1079/1500 [11:39<04:31, 1.55it/s, loss=0.0998, lr=1]\nSteps: 72%|███████▏ | 1079/1500 [11:39<04:31, 1.55it/s, loss=0.093, lr=1] \nSteps: 72%|███████▏ | 1080/1500 [11:39<04:31, 1.55it/s, loss=0.093, lr=1]\nSteps: 72%|███████▏ | 1080/1500 [11:39<04:31, 1.55it/s, loss=0.123, lr=1]\nSteps: 72%|███████▏ | 1081/1500 [11:40<04:30, 1.55it/s, loss=0.123, lr=1]\nSteps: 72%|███████▏ | 1081/1500 [11:40<04:30, 1.55it/s, loss=0.129, lr=1]\nSteps: 72%|███████▏ | 1082/1500 [11:41<04:29, 1.55it/s, loss=0.129, lr=1]\nSteps: 72%|███████▏ | 1082/1500 [11:41<04:29, 1.55it/s, loss=0.0877, lr=1]\nSteps: 72%|███████▏ | 1083/1500 [11:41<04:29, 1.55it/s, loss=0.0877, lr=1]\nSteps: 72%|███████▏ | 1083/1500 [11:41<04:29, 1.55it/s, loss=0.135, lr=1] \nSteps: 72%|███████▏ | 1084/1500 [11:42<04:28, 1.55it/s, loss=0.135, lr=1]\nSteps: 72%|███████▏ | 1084/1500 [11:42<04:28, 1.55it/s, loss=0.191, lr=1]\nSteps: 72%|███████▏ | 1085/1500 [11:43<04:27, 1.55it/s, loss=0.191, lr=1]\nSteps: 72%|███████▏ | 1085/1500 [11:43<04:27, 1.55it/s, loss=0.0872, lr=1]\nSteps: 72%|███████▏ | 1086/1500 [11:43<04:26, 1.55it/s, loss=0.0872, lr=1]\nSteps: 72%|███████▏ | 1086/1500 [11:43<04:26, 1.55it/s, loss=0.158, lr=1] \nSteps: 72%|███████▏ | 1087/1500 [11:44<04:26, 1.55it/s, loss=0.158, lr=1]\nSteps: 72%|███████▏ | 1087/1500 [11:44<04:26, 1.55it/s, loss=0.167, lr=1]\nSteps: 73%|███████▎ | 1088/1500 [11:45<04:25, 1.55it/s, loss=0.167, 
lr=1]\nSteps: 73%|███████▎ | 1088/1500 [11:45<04:25, 1.55it/s, loss=0.142, lr=1]\nSteps: 73%|███████▎ | 1089/1500 [11:45<04:26, 1.54it/s, loss=0.142, lr=1]\nSteps: 73%|███████▎ | 1089/1500 [11:45<04:26, 1.54it/s, loss=0.144, lr=1]\nSteps: 73%|███████▎ | 1090/1500 [11:46<04:25, 1.54it/s, loss=0.144, lr=1]\nSteps: 73%|███████▎ | 1090/1500 [11:46<04:25, 1.54it/s, loss=0.175, lr=1]\nSteps: 73%|███████▎ | 1091/1500 [11:46<04:24, 1.55it/s, loss=0.175, lr=1]\nSteps: 73%|███████▎ | 1091/1500 [11:46<04:24, 1.55it/s, loss=0.167, lr=1]\nSteps: 73%|███████▎ | 1092/1500 [11:47<04:23, 1.55it/s, loss=0.167, lr=1]\nSteps: 73%|███████▎ | 1092/1500 [11:47<04:23, 1.55it/s, loss=0.203, lr=1]\nSteps: 73%|███████▎ | 1093/1500 [11:48<04:22, 1.55it/s, loss=0.203, lr=1]\nSteps: 73%|███████▎ | 1093/1500 [11:48<04:22, 1.55it/s, loss=0.05, lr=1] \nSteps: 73%|███████▎ | 1094/1500 [11:48<04:22, 1.55it/s, loss=0.05, lr=1]\nSteps: 73%|███████▎ | 1094/1500 [11:48<04:22, 1.55it/s, loss=0.124, lr=1]\nSteps: 73%|███████▎ | 1095/1500 [11:49<04:21, 1.55it/s, loss=0.124, lr=1]\nSteps: 73%|███████▎ | 1095/1500 [11:49<04:21, 1.55it/s, loss=0.0726, lr=1]\nSteps: 73%|███████▎ | 1096/1500 [11:50<04:20, 1.55it/s, loss=0.0726, lr=1]\nSteps: 73%|███████▎ | 1096/1500 [11:50<04:20, 1.55it/s, loss=0.117, lr=1] \nSteps: 73%|███████▎ | 1097/1500 [11:50<04:20, 1.55it/s, loss=0.117, lr=1]\nSteps: 73%|███████▎ | 1097/1500 [11:50<04:20, 1.55it/s, loss=0.171, lr=1]\nSteps: 73%|███████▎ | 1098/1500 [11:51<04:19, 1.55it/s, loss=0.171, lr=1]\nSteps: 73%|███████▎ | 1098/1500 [11:51<04:19, 1.55it/s, loss=0.19, lr=1] \nSteps: 73%|███████▎ | 1099/1500 [11:52<04:18, 1.55it/s, loss=0.19, lr=1]\nSteps: 73%|███████▎ | 1099/1500 [11:52<04:18, 1.55it/s, loss=0.0613, lr=1]\nSteps: 73%|███████▎ | 1100/1500 [11:52<04:18, 1.55it/s, loss=0.0613, lr=1]\nSteps: 73%|███████▎ | 1100/1500 [11:52<04:18, 1.55it/s, loss=0.132, lr=1] \nSteps: 73%|███████▎ | 1101/1500 [11:53<04:17, 1.55it/s, loss=0.132, lr=1]\nSteps: 73%|███████▎ | 1101/1500 
[11:53<04:17, 1.55it/s, loss=0.0821, lr=1]\nSteps: 73%|███████▎ | 1102/1500 [11:54<04:16, 1.55it/s, loss=0.0821, lr=1]\nSteps: 73%|███████▎ | 1102/1500 [11:54<04:16, 1.55it/s, loss=0.152, lr=1] \nSteps: 74%|███████▎ | 1103/1500 [11:54<04:16, 1.55it/s, loss=0.152, lr=1]\nSteps: 74%|███████▎ | 1103/1500 [11:54<04:16, 1.55it/s, loss=0.138, lr=1]\nSteps: 74%|███████▎ | 1104/1500 [11:55<04:15, 1.55it/s, loss=0.138, lr=1]\nSteps: 74%|███████▎ | 1104/1500 [11:55<04:15, 1.55it/s, loss=0.274, lr=1]\nSteps: 74%|███████▎ | 1105/1500 [11:56<04:16, 1.54it/s, loss=0.274, lr=1]\nSteps: 74%|███████▎ | 1105/1500 [11:56<04:16, 1.54it/s, loss=0.091, lr=1]\nSteps: 74%|███████▎ | 1106/1500 [11:56<04:15, 1.54it/s, loss=0.091, lr=1]\nSteps: 74%|███████▎ | 1106/1500 [11:56<04:15, 1.54it/s, loss=0.0875, lr=1]\nSteps: 74%|███████▍ | 1107/1500 [11:57<04:14, 1.55it/s, loss=0.0875, lr=1]\nSteps: 74%|███████▍ | 1107/1500 [11:57<04:14, 1.55it/s, loss=0.203, lr=1] \nSteps: 74%|███████▍ | 1108/1500 [11:57<04:13, 1.55it/s, loss=0.203, lr=1]\nSteps: 74%|███████▍ | 1108/1500 [11:57<04:13, 1.55it/s, loss=0.0384, lr=1]\nSteps: 74%|███████▍ | 1109/1500 [11:58<04:12, 1.55it/s, loss=0.0384, lr=1]\nSteps: 74%|███████▍ | 1109/1500 [11:58<04:12, 1.55it/s, loss=0.137, lr=1] \nSteps: 74%|███████▍ | 1110/1500 [11:59<04:11, 1.55it/s, loss=0.137, lr=1]\nSteps: 74%|███████▍ | 1110/1500 [11:59<04:11, 1.55it/s, loss=0.18, lr=1] \nSteps: 74%|███████▍ | 1111/1500 [11:59<04:11, 1.55it/s, loss=0.18, lr=1]\nSteps: 74%|███████▍ | 1111/1500 [11:59<04:11, 1.55it/s, loss=0.186, lr=1]\nSteps: 74%|███████▍ | 1112/1500 [12:00<04:10, 1.55it/s, loss=0.186, lr=1]\nSteps: 74%|███████▍ | 1112/1500 [12:00<04:10, 1.55it/s, loss=0.0793, lr=1]\nSteps: 74%|███████▍ | 1113/1500 [12:01<04:09, 1.55it/s, loss=0.0793, lr=1]\nSteps: 74%|███████▍ | 1113/1500 [12:01<04:09, 1.55it/s, loss=0.136, lr=1] \nSteps: 74%|███████▍ | 1114/1500 [12:01<04:09, 1.55it/s, loss=0.136, lr=1]\nSteps: 74%|███████▍ | 1114/1500 [12:01<04:09, 1.55it/s, loss=0.149, 
lr=1]\nSteps: 74%|███████▍ | 1115/1500 [12:02<04:08, 1.55it/s, loss=0.149, lr=1]\nSteps: 74%|███████▍ | 1115/1500 [12:02<04:08, 1.55it/s, loss=0.122, lr=1]\nSteps: 74%|███████▍ | 1116/1500 [12:03<04:07, 1.55it/s, loss=0.122, lr=1]\nSteps: 74%|███████▍ | 1116/1500 [12:03<04:07, 1.55it/s, loss=0.152, lr=1]\nSteps: 74%|███████▍ | 1117/1500 [12:03<04:07, 1.55it/s, loss=0.152, lr=1]\nSteps: 74%|███████▍ | 1117/1500 [12:03<04:07, 1.55it/s, loss=0.0338, lr=1]\nSteps: 75%|███████▍ | 1118/1500 [12:04<04:07, 1.54it/s, loss=0.0338, lr=1]\nSteps: 75%|███████▍ | 1118/1500 [12:04<04:07, 1.54it/s, loss=0.0932, lr=1]\nSteps: 75%|███████▍ | 1119/1500 [12:05<04:06, 1.55it/s, loss=0.0932, lr=1]\nSteps: 75%|███████▍ | 1119/1500 [12:05<04:06, 1.55it/s, loss=0.164, lr=1] \nSteps: 75%|███████▍ | 1120/1500 [12:05<04:05, 1.55it/s, loss=0.164, lr=1]\nSteps: 75%|███████▍ | 1120/1500 [12:05<04:05, 1.55it/s, loss=0.0811, lr=1]\nSteps: 75%|███████▍ | 1121/1500 [12:06<04:06, 1.54it/s, loss=0.0811, lr=1]\nSteps: 75%|███████▍ | 1121/1500 [12:06<04:06, 1.54it/s, loss=0.104, lr=1] \nSteps: 75%|███████▍ | 1122/1500 [12:06<04:05, 1.54it/s, loss=0.104, lr=1]\nSteps: 75%|███████▍ | 1122/1500 [12:06<04:05, 1.54it/s, loss=0.125, lr=1]\nSteps: 75%|███████▍ | 1123/1500 [12:07<04:04, 1.54it/s, loss=0.125, lr=1]\nSteps: 75%|███████▍ | 1123/1500 [12:07<04:04, 1.54it/s, loss=0.131, lr=1]\nSteps: 75%|███████▍ | 1124/1500 [12:08<04:03, 1.55it/s, loss=0.131, lr=1]\nSteps: 75%|███████▍ | 1124/1500 [12:08<04:03, 1.55it/s, loss=0.123, lr=1]\nSteps: 75%|███████▌ | 1125/1500 [12:08<04:02, 1.55it/s, loss=0.123, lr=1]\nSteps: 75%|███████▌ | 1125/1500 [12:08<04:02, 1.55it/s, loss=0.0789, lr=1]\nSteps: 75%|███████▌ | 1126/1500 [12:09<04:01, 1.55it/s, loss=0.0789, lr=1]\nSteps: 75%|███████▌ | 1126/1500 [12:09<04:01, 1.55it/s, loss=0.101, lr=1] \nSteps: 75%|███████▌ | 1127/1500 [12:10<04:00, 1.55it/s, loss=0.101, lr=1]\nSteps: 75%|███████▌ | 1127/1500 [12:10<04:00, 1.55it/s, loss=0.165, lr=1]\nSteps: 75%|███████▌ | 1128/1500 
[12:10<04:00, 1.55it/s, loss=0.165, lr=1]\nSteps: 75%|███████▌ | 1128/1500 [12:10<04:00, 1.55it/s, loss=0.0926, lr=1]\nSteps: 75%|███████▌ | 1129/1500 [12:11<03:59, 1.55it/s, loss=0.0926, lr=1]\nSteps: 75%|███████▌ | 1129/1500 [12:11<03:59, 1.55it/s, loss=0.0838, lr=1]\nSteps: 75%|███████▌ | 1130/1500 [12:12<03:58, 1.55it/s, loss=0.0838, lr=1]\nSteps: 75%|███████▌ | 1130/1500 [12:12<03:58, 1.55it/s, loss=0.103, lr=1] \nSteps: 75%|███████▌ | 1131/1500 [12:12<03:58, 1.55it/s, loss=0.103, lr=1]\nSteps: 75%|███████▌ | 1131/1500 [12:12<03:58, 1.55it/s, loss=0.117, lr=1]\nSteps: 75%|███████▌ | 1132/1500 [12:13<03:57, 1.55it/s, loss=0.117, lr=1]\nSteps: 75%|███████▌ | 1132/1500 [12:13<03:57, 1.55it/s, loss=0.115, lr=1]\nSteps: 76%|███████▌ | 1133/1500 [12:14<03:56, 1.55it/s, loss=0.115, lr=1]\nSteps: 76%|███████▌ | 1133/1500 [12:14<03:56, 1.55it/s, loss=0.32, lr=1] \nSteps: 76%|███████▌ | 1134/1500 [12:14<03:56, 1.55it/s, loss=0.32, lr=1]\nSteps: 76%|███████▌ | 1134/1500 [12:14<03:56, 1.55it/s, loss=0.155, lr=1]\nSteps: 76%|███████▌ | 1135/1500 [12:15<03:55, 1.55it/s, loss=0.155, lr=1]\nSteps: 76%|███████▌ | 1135/1500 [12:15<03:55, 1.55it/s, loss=0.21, lr=1] \nSteps: 76%|███████▌ | 1136/1500 [12:16<03:54, 1.55it/s, loss=0.21, lr=1]\nSteps: 76%|███████▌ | 1136/1500 [12:16<03:54, 1.55it/s, loss=0.172, lr=1]\nSteps: 76%|███████▌ | 1137/1500 [12:16<03:55, 1.54it/s, loss=0.172, lr=1]\nSteps: 76%|███████▌ | 1137/1500 [12:16<03:55, 1.54it/s, loss=0.156, lr=1]\nSteps: 76%|███████▌ | 1138/1500 [12:17<03:54, 1.54it/s, loss=0.156, lr=1]\nSteps: 76%|███████▌ | 1138/1500 [12:17<03:54, 1.54it/s, loss=0.122, lr=1]\nSteps: 76%|███████▌ | 1139/1500 [12:17<03:53, 1.55it/s, loss=0.122, lr=1]\nSteps: 76%|███████▌ | 1139/1500 [12:17<03:53, 1.55it/s, loss=0.137, lr=1]\nSteps: 76%|███████▌ | 1140/1500 [12:18<03:52, 1.55it/s, loss=0.137, lr=1]\nSteps: 76%|███████▌ | 1140/1500 [12:18<03:52, 1.55it/s, loss=0.0809, lr=1]\nSteps: 76%|███████▌ | 1141/1500 [12:19<03:51, 1.55it/s, loss=0.0809, 
lr=1]\nSteps: 76%|███████▌ | 1141/1500 [12:19<03:51, 1.55it/s, loss=0.109, lr=1] \nSteps: 76%|███████▌ | 1142/1500 [12:19<03:51, 1.55it/s, loss=0.109, lr=1]\nSteps: 76%|███████▌ | 1142/1500 [12:19<03:51, 1.55it/s, loss=0.054, lr=1]\nSteps: 76%|███████▌ | 1143/1500 [12:20<03:50, 1.55it/s, loss=0.054, lr=1]\nSteps: 76%|███████▌ | 1143/1500 [12:20<03:50, 1.55it/s, loss=0.0625, lr=1]\nSteps: 76%|███████▋ | 1144/1500 [12:21<03:49, 1.55it/s, loss=0.0625, lr=1]\nSteps: 76%|███████▋ | 1144/1500 [12:21<03:49, 1.55it/s, loss=0.118, lr=1] \nSteps: 76%|███████▋ | 1145/1500 [12:21<03:49, 1.55it/s, loss=0.118, lr=1]\nSteps: 76%|███████▋ | 1145/1500 [12:21<03:49, 1.55it/s, loss=0.0942, lr=1]\nSteps: 76%|███████▋ | 1146/1500 [12:22<03:48, 1.55it/s, loss=0.0942, lr=1]\nSteps: 76%|███████▋ | 1146/1500 [12:22<03:48, 1.55it/s, loss=0.0837, lr=1]\nSteps: 76%|███████▋ | 1147/1500 [12:23<03:48, 1.55it/s, loss=0.0837, lr=1]\nSteps: 76%|███████▋ | 1147/1500 [12:23<03:48, 1.55it/s, loss=0.0677, lr=1]\nSteps: 77%|███████▋ | 1148/1500 [12:23<03:47, 1.55it/s, loss=0.0677, lr=1]\nSteps: 77%|███████▋ | 1148/1500 [12:23<03:47, 1.55it/s, loss=0.197, lr=1] \nSteps: 77%|███████▋ | 1149/1500 [12:24<03:46, 1.55it/s, loss=0.197, lr=1]\nSteps: 77%|███████▋ | 1149/1500 [12:24<03:46, 1.55it/s, loss=0.229, lr=1]\nSteps: 77%|███████▋ | 1150/1500 [12:25<03:46, 1.55it/s, loss=0.229, lr=1]\nSteps: 77%|███████▋ | 1150/1500 [12:25<03:46, 1.55it/s, loss=0.148, lr=1]\nSteps: 77%|███████▋ | 1151/1500 [12:25<03:45, 1.55it/s, loss=0.148, lr=1]\nSteps: 77%|███████▋ | 1151/1500 [12:25<03:45, 1.55it/s, loss=0.0846, lr=1]\nSteps: 77%|███████▋ | 1152/1500 [12:26<03:44, 1.55it/s, loss=0.0846, lr=1]\nSteps: 77%|███████▋ | 1152/1500 [12:26<03:44, 1.55it/s, loss=0.0456, lr=1]\nSteps: 77%|███████▋ | 1153/1500 [12:27<03:45, 1.54it/s, loss=0.0456, lr=1]\nSteps: 77%|███████▋ | 1153/1500 [12:27<03:45, 1.54it/s, loss=0.0381, lr=1]\nSteps: 77%|███████▋ | 1154/1500 [12:27<03:44, 1.54it/s, loss=0.0381, lr=1]\nSteps: 77%|███████▋ | 
1154/1500 [12:27<03:44, 1.54it/s, loss=0.0959, lr=1]\nSteps: 77%|███████▋ | 1155/1500 [12:28<03:43, 1.54it/s, loss=0.0959, lr=1]\nSteps: 77%|███████▋ | 1155/1500 [12:28<03:43, 1.54it/s, loss=0.251, lr=1] \nSteps: 77%|███████▋ | 1156/1500 [12:28<03:42, 1.55it/s, loss=0.251, lr=1]\nSteps: 77%|███████▋ | 1156/1500 [12:28<03:42, 1.55it/s, loss=0.0685, lr=1]\nSteps: 77%|███████▋ | 1157/1500 [12:29<03:41, 1.55it/s, loss=0.0685, lr=1]\nSteps: 77%|███████▋ | 1157/1500 [12:29<03:41, 1.55it/s, loss=0.0704, lr=1]\nSteps: 77%|███████▋ | 1158/1500 [12:30<03:40, 1.55it/s, loss=0.0704, lr=1]\nSteps: 77%|███████▋ | 1158/1500 [12:30<03:40, 1.55it/s, loss=0.124, lr=1] \nSteps: 77%|███████▋ | 1159/1500 [12:30<03:40, 1.55it/s, loss=0.124, lr=1]\nSteps: 77%|███████▋ | 1159/1500 [12:30<03:40, 1.55it/s, loss=0.0207, lr=1]\nSteps: 77%|███████▋ | 1160/1500 [12:31<03:39, 1.55it/s, loss=0.0207, lr=1]\nSteps: 77%|███████▋ | 1160/1500 [12:31<03:39, 1.55it/s, loss=0.167, lr=1] \nSteps: 77%|███████▋ | 1161/1500 [12:32<03:38, 1.55it/s, loss=0.167, lr=1]\nSteps: 77%|███████▋ | 1161/1500 [12:32<03:38, 1.55it/s, loss=0.0681, lr=1]\nSteps: 77%|███████▋ | 1162/1500 [12:32<03:38, 1.55it/s, loss=0.0681, lr=1]\nSteps: 77%|███████▋ | 1162/1500 [12:32<03:38, 1.55it/s, loss=0.0589, lr=1]\nSteps: 78%|███████▊ | 1163/1500 [12:33<03:37, 1.55it/s, loss=0.0589, lr=1]\nSteps: 78%|███████▊ | 1163/1500 [12:33<03:37, 1.55it/s, loss=0.108, lr=1] \nSteps: 78%|███████▊ | 1164/1500 [12:34<03:36, 1.55it/s, loss=0.108, lr=1]\nSteps: 78%|███████▊ | 1164/1500 [12:34<03:36, 1.55it/s, loss=0.29, lr=1] \nSteps: 78%|███████▊ | 1165/1500 [12:34<03:36, 1.54it/s, loss=0.29, lr=1]\nSteps: 78%|███████▊ | 1165/1500 [12:34<03:36, 1.54it/s, loss=0.122, lr=1]\nSteps: 78%|███████▊ | 1166/1500 [12:35<03:36, 1.55it/s, loss=0.122, lr=1]\nSteps: 78%|███████▊ | 1166/1500 [12:35<03:36, 1.55it/s, loss=0.0519, lr=1]\nSteps: 78%|███████▊ | 1167/1500 [12:36<03:35, 1.55it/s, loss=0.0519, lr=1]\nSteps: 78%|███████▊ | 1167/1500 [12:36<03:35, 
1.55it/s, loss=0.1, lr=1] \nSteps: 78%|███████▊ | 1168/1500 [12:36<03:34, 1.55it/s, loss=0.1, lr=1]\nSteps: 78%|███████▊ | 1168/1500 [12:36<03:34, 1.55it/s, loss=0.132, lr=1]\nSteps: 78%|███████▊ | 1169/1500 [12:37<03:36, 1.53it/s, loss=0.132, lr=1]\nSteps: 78%|███████▊ | 1169/1500 [12:37<03:36, 1.53it/s, loss=0.121, lr=1]\nSteps: 78%|███████▊ | 1170/1500 [12:38<03:35, 1.53it/s, loss=0.121, lr=1]\nSteps: 78%|███████▊ | 1170/1500 [12:38<03:35, 1.53it/s, loss=0.306, lr=1]\nSteps: 78%|███████▊ | 1171/1500 [12:38<03:34, 1.53it/s, loss=0.306, lr=1]\nSteps: 78%|███████▊ | 1171/1500 [12:38<03:34, 1.53it/s, loss=0.0996, lr=1]\nSteps: 78%|███████▊ | 1172/1500 [12:39<03:33, 1.54it/s, loss=0.0996, lr=1]\nSteps: 78%|███████▊ | 1172/1500 [12:39<03:33, 1.54it/s, loss=0.147, lr=1] \nSteps: 78%|███████▊ | 1173/1500 [12:39<03:32, 1.54it/s, loss=0.147, lr=1]\nSteps: 78%|███████▊ | 1173/1500 [12:39<03:32, 1.54it/s, loss=0.134, lr=1]\nSteps: 78%|███████▊ | 1174/1500 [12:40<03:31, 1.54it/s, loss=0.134, lr=1]\nSteps: 78%|███████▊ | 1174/1500 [12:40<03:31, 1.54it/s, loss=0.174, lr=1]\nSteps: 78%|███████▊ | 1175/1500 [12:41<03:30, 1.54it/s, loss=0.174, lr=1]\nSteps: 78%|███████▊ | 1175/1500 [12:41<03:30, 1.54it/s, loss=0.106, lr=1]\nSteps: 78%|███████▊ | 1176/1500 [12:41<03:29, 1.54it/s, loss=0.106, lr=1]\nSteps: 78%|███████▊ | 1176/1500 [12:41<03:29, 1.54it/s, loss=0.148, lr=1]\nSteps: 78%|███████▊ | 1177/1500 [12:42<03:29, 1.55it/s, loss=0.148, lr=1]\nSteps: 78%|███████▊ | 1177/1500 [12:42<03:29, 1.55it/s, loss=0.0922, lr=1]\nSteps: 79%|███████▊ | 1178/1500 [12:43<03:28, 1.54it/s, loss=0.0922, lr=1]\nSteps: 79%|███████▊ | 1178/1500 [12:43<03:28, 1.54it/s, loss=0.194, lr=1] \nSteps: 79%|███████▊ | 1179/1500 [12:43<03:27, 1.54it/s, loss=0.194, lr=1]\nSteps: 79%|███████▊ | 1179/1500 [12:43<03:27, 1.54it/s, loss=0.135, lr=1]\nSteps: 79%|███████▊ | 1180/1500 [12:44<03:39, 1.46it/s, loss=0.135, lr=1]\nSteps: 79%|███████▊ | 1180/1500 [12:44<03:39, 1.46it/s, loss=0.0799, lr=1]\nSteps: 
79%|███████▊ | 1181/1500 [12:45<03:34, 1.48it/s, loss=0.0799, lr=1]\nSteps: 79%|███████▊ | 1181/1500 [12:45<03:34, 1.48it/s, loss=0.225, lr=1] \nSteps: 79%|███████▉ | 1182/1500 [12:45<03:31, 1.50it/s, loss=0.225, lr=1]\nSteps: 79%|███████▉ | 1182/1500 [12:45<03:31, 1.50it/s, loss=0.0779, lr=1]\nSteps: 79%|███████▉ | 1183/1500 [12:46<03:28, 1.52it/s, loss=0.0779, lr=1]\nSteps: 79%|███████▉ | 1183/1500 [12:46<03:28, 1.52it/s, loss=0.184, lr=1] \nSteps: 79%|███████▉ | 1184/1500 [12:47<03:26, 1.53it/s, loss=0.184, lr=1]\nSteps: 79%|███████▉ | 1184/1500 [12:47<03:26, 1.53it/s, loss=0.0466, lr=1]\nSteps: 79%|███████▉ | 1185/1500 [12:47<03:26, 1.53it/s, loss=0.0466, lr=1]\nSteps: 79%|███████▉ | 1185/1500 [12:47<03:26, 1.53it/s, loss=0.239, lr=1] \nSteps: 79%|███████▉ | 1186/1500 [12:48<03:24, 1.53it/s, loss=0.239, lr=1]\nSteps: 79%|███████▉ | 1186/1500 [12:48<03:24, 1.53it/s, loss=0.115, lr=1]\nSteps: 79%|███████▉ | 1187/1500 [12:49<03:23, 1.54it/s, loss=0.115, lr=1]\nSteps: 79%|███████▉ | 1187/1500 [12:49<03:23, 1.54it/s, loss=0.112, lr=1]\nSteps: 79%|███████▉ | 1188/1500 [12:49<03:22, 1.54it/s, loss=0.112, lr=1]\nSteps: 79%|███████▉ | 1188/1500 [12:49<03:22, 1.54it/s, loss=0.0711, lr=1]\nSteps: 79%|███████▉ | 1189/1500 [12:50<03:21, 1.54it/s, loss=0.0711, lr=1]\nSteps: 79%|███████▉ | 1189/1500 [12:50<03:21, 1.54it/s, loss=0.0877, lr=1]\nSteps: 79%|███████▉ | 1190/1500 [12:51<03:20, 1.54it/s, loss=0.0877, lr=1]\nSteps: 79%|███████▉ | 1190/1500 [12:51<03:20, 1.54it/s, loss=0.0816, lr=1]\nSteps: 79%|███████▉ | 1191/1500 [12:51<03:19, 1.55it/s, loss=0.0816, lr=1]\nSteps: 79%|███████▉ | 1191/1500 [12:51<03:19, 1.55it/s, loss=0.0971, lr=1]\nSteps: 79%|███████▉ | 1192/1500 [12:52<03:19, 1.55it/s, loss=0.0971, lr=1]\nSteps: 79%|███████▉ | 1192/1500 [12:52<03:19, 1.55it/s, loss=0.171, lr=1] \nSteps: 80%|███████▉ | 1193/1500 [12:53<03:18, 1.55it/s, loss=0.171, lr=1]\nSteps: 80%|███████▉ | 1193/1500 [12:53<03:18, 1.55it/s, loss=0.338, lr=1]\nSteps: 80%|███████▉ | 1194/1500 
[12:53<03:17, 1.55it/s, loss=0.338, lr=1]\nSteps: 80%|███████▉ | 1194/1500 [12:53<03:17, 1.55it/s, loss=0.146, lr=1]\nSteps: 80%|███████▉ | 1195/1500 [12:54<03:17, 1.55it/s, loss=0.146, lr=1]\nSteps: 80%|███████▉ | 1195/1500 [12:54<03:17, 1.55it/s, loss=0.218, lr=1]\nSteps: 80%|███████▉ | 1196/1500 [12:54<03:16, 1.55it/s, loss=0.218, lr=1]\nSteps: 80%|███████▉ | 1196/1500 [12:54<03:16, 1.55it/s, loss=0.0801, lr=1]\nSteps: 80%|███████▉ | 1197/1500 [12:55<03:16, 1.54it/s, loss=0.0801, lr=1]\nSteps: 80%|███████▉ | 1197/1500 [12:55<03:16, 1.54it/s, loss=0.0925, lr=1]\nSteps: 80%|███████▉ | 1198/1500 [12:56<03:15, 1.54it/s, loss=0.0925, lr=1]\nSteps: 80%|███████▉ | 1198/1500 [12:56<03:15, 1.54it/s, loss=0.152, lr=1] \nSteps: 80%|███████▉ | 1199/1500 [12:56<03:14, 1.54it/s, loss=0.152, lr=1]\nSteps: 80%|███████▉ | 1199/1500 [12:56<03:14, 1.54it/s, loss=0.138, lr=1]\nSteps: 80%|████████ | 1200/1500 [12:57<03:14, 1.55it/s, loss=0.138, lr=1]\nSteps: 80%|████████ | 1200/1500 [12:57<03:14, 1.55it/s, loss=0.146, lr=1]\nSteps: 80%|████████ | 1201/1500 [12:58<03:14, 1.54it/s, loss=0.146, lr=1]\nSteps: 80%|████████ | 1201/1500 [12:58<03:14, 1.54it/s, loss=0.0313, lr=1]\nSteps: 80%|████████ | 1202/1500 [12:58<03:13, 1.54it/s, loss=0.0313, lr=1]\nSteps: 80%|████████ | 1202/1500 [12:58<03:13, 1.54it/s, loss=0.0984, lr=1]\nSteps: 80%|████████ | 1203/1500 [12:59<03:12, 1.54it/s, loss=0.0984, lr=1]\nSteps: 80%|████████ | 1203/1500 [12:59<03:12, 1.54it/s, loss=0.0619, lr=1]\nSteps: 80%|████████ | 1204/1500 [13:00<03:11, 1.54it/s, loss=0.0619, lr=1]\nSteps: 80%|████████ | 1204/1500 [13:00<03:11, 1.54it/s, loss=0.0622, lr=1]\nSteps: 80%|████████ | 1205/1500 [13:00<03:11, 1.54it/s, loss=0.0622, lr=1]\nSteps: 80%|████████ | 1205/1500 [13:00<03:11, 1.54it/s, loss=0.128, lr=1] \nSteps: 80%|████████ | 1206/1500 [13:01<03:10, 1.55it/s, loss=0.128, lr=1]\nSteps: 80%|████████ | 1206/1500 [13:01<03:10, 1.55it/s, loss=0.034, lr=1]\nSteps: 80%|████████ | 1207/1500 [13:02<03:09, 1.55it/s, loss=0.034, 
lr=1]\nSteps: 80%|████████ | 1207/1500 [13:02<03:09, 1.55it/s, loss=0.124, lr=1]\nSteps: 81%|████████ | 1208/1500 [13:02<03:08, 1.55it/s, loss=0.124, lr=1]\nSteps: 81%|████████ | 1208/1500 [13:02<03:08, 1.55it/s, loss=0.117, lr=1]\nSteps: 81%|████████ | 1209/1500 [13:03<03:07, 1.55it/s, loss=0.117, lr=1]\nSteps: 81%|████████ | 1209/1500 [13:03<03:07, 1.55it/s, loss=0.14, lr=1] \nSteps: 81%|████████ | 1210/1500 [13:04<03:07, 1.55it/s, loss=0.14, lr=1]\nSteps: 81%|████████ | 1210/1500 [13:04<03:07, 1.55it/s, loss=0.104, lr=1]\nSteps: 81%|████████ | 1211/1500 [13:04<03:06, 1.55it/s, loss=0.104, lr=1]\nSteps: 81%|████████ | 1211/1500 [13:04<03:06, 1.55it/s, loss=0.241, lr=1]\nSteps: 81%|████████ | 1212/1500 [13:05<03:06, 1.55it/s, loss=0.241, lr=1]\nSteps: 81%|████████ | 1212/1500 [13:05<03:06, 1.55it/s, loss=0.109, lr=1]\nSteps: 81%|████████ | 1213/1500 [13:05<03:05, 1.54it/s, loss=0.109, lr=1]\nSteps: 81%|████████ | 1213/1500 [13:05<03:05, 1.54it/s, loss=0.0809, lr=1]\nSteps: 81%|████████ | 1214/1500 [13:06<03:05, 1.54it/s, loss=0.0809, lr=1]\nSteps: 81%|████████ | 1214/1500 [13:06<03:05, 1.54it/s, loss=0.0724, lr=1]\nSteps: 81%|████████ | 1215/1500 [13:07<03:04, 1.54it/s, loss=0.0724, lr=1]\nSteps: 81%|████████ | 1215/1500 [13:07<03:04, 1.54it/s, loss=0.146, lr=1] \nSteps: 81%|████████ | 1216/1500 [13:07<03:03, 1.55it/s, loss=0.146, lr=1]\nSteps: 81%|████████ | 1216/1500 [13:07<03:03, 1.55it/s, loss=0.0946, lr=1]\nSteps: 81%|████████ | 1217/1500 [13:08<03:04, 1.54it/s, loss=0.0946, lr=1]\nSteps: 81%|████████ | 1217/1500 [13:08<03:04, 1.54it/s, loss=0.161, lr=1] \nSteps: 81%|████████ | 1218/1500 [13:09<03:03, 1.54it/s, loss=0.161, lr=1]\nSteps: 81%|████████ | 1218/1500 [13:09<03:03, 1.54it/s, loss=0.0411, lr=1]\nSteps: 81%|████████▏ | 1219/1500 [13:09<03:02, 1.54it/s, loss=0.0411, lr=1]\nSteps: 81%|████████▏ | 1219/1500 [13:09<03:02, 1.54it/s, loss=0.0622, lr=1]\nSteps: 81%|████████▏ | 1220/1500 [13:10<03:01, 1.54it/s, loss=0.0622, lr=1]\nSteps: 81%|████████▏ | 
1220/1500 [13:10<03:01, 1.54it/s, loss=0.302, lr=1] \nSteps: 81%|████████▏ | 1221/1500 [13:11<03:00, 1.54it/s, loss=0.302, lr=1]\nSteps: 81%|████████▏ | 1221/1500 [13:11<03:00, 1.54it/s, loss=0.197, lr=1]\nSteps: 81%|████████▏ | 1222/1500 [13:11<03:00, 1.54it/s, loss=0.197, lr=1]\nSteps: 81%|████████▏ | 1222/1500 [13:11<03:00, 1.54it/s, loss=0.2, lr=1] \nSteps: 82%|████████▏ | 1223/1500 [13:12<02:59, 1.54it/s, loss=0.2, lr=1]\nSteps: 82%|████████▏ | 1223/1500 [13:12<02:59, 1.54it/s, loss=0.278, lr=1]\nSteps: 82%|████████▏ | 1224/1500 [13:13<02:58, 1.54it/s, loss=0.278, lr=1]\nSteps: 82%|████████▏ | 1224/1500 [13:13<02:58, 1.54it/s, loss=0.123, lr=1]\nSteps: 82%|████████▏ | 1225/1500 [13:13<02:58, 1.54it/s, loss=0.123, lr=1]\nSteps: 82%|████████▏ | 1225/1500 [13:13<02:58, 1.54it/s, loss=0.0647, lr=1]\nSteps: 82%|████████▏ | 1226/1500 [13:14<02:57, 1.54it/s, loss=0.0647, lr=1]\nSteps: 82%|████████▏ | 1226/1500 [13:14<02:57, 1.54it/s, loss=0.0565, lr=1]\nSteps: 82%|████████▏ | 1227/1500 [13:15<02:56, 1.55it/s, loss=0.0565, lr=1]\nSteps: 82%|████████▏ | 1227/1500 [13:15<02:56, 1.55it/s, loss=0.173, lr=1] \nSteps: 82%|████████▏ | 1228/1500 [13:15<02:55, 1.55it/s, loss=0.173, lr=1]\nSteps: 82%|████████▏ | 1228/1500 [13:15<02:55, 1.55it/s, loss=0.106, lr=1]\nSteps: 82%|████████▏ | 1229/1500 [13:16<02:55, 1.55it/s, loss=0.106, lr=1]\nSteps: 82%|████████▏ | 1229/1500 [13:16<02:55, 1.55it/s, loss=0.0923, lr=1]\nSteps: 82%|████████▏ | 1230/1500 [13:16<02:54, 1.55it/s, loss=0.0923, lr=1]\nSteps: 82%|████████▏ | 1230/1500 [13:16<02:54, 1.55it/s, loss=0.0842, lr=1]\nSteps: 82%|████████▏ | 1231/1500 [13:17<02:53, 1.55it/s, loss=0.0842, lr=1]\nSteps: 82%|████████▏ | 1231/1500 [13:17<02:53, 1.55it/s, loss=0.135, lr=1] \nSteps: 82%|████████▏ | 1232/1500 [13:18<02:52, 1.55it/s, loss=0.135, lr=1]\nSteps: 82%|████████▏ | 1232/1500 [13:18<02:52, 1.55it/s, loss=0.32, lr=1] \nSteps: 82%|████████▏ | 1233/1500 [13:18<02:53, 1.54it/s, loss=0.32, lr=1]\nSteps: 82%|████████▏ | 1233/1500 
[13:18<02:53, 1.54it/s, loss=0.161, lr=1]\nSteps: 82%|████████▏ | 1234/1500 [13:19<02:52, 1.54it/s, loss=0.161, lr=1]\nSteps: 82%|████████▏ | 1234/1500 [13:19<02:52, 1.54it/s, loss=0.123, lr=1]\nSteps: 82%|████████▏ | 1235/1500 [13:20<02:51, 1.55it/s, loss=0.123, lr=1]\nSteps: 82%|████████▏ | 1235/1500 [13:20<02:51, 1.55it/s, loss=0.276, lr=1]\nSteps: 82%|████████▏ | 1236/1500 [13:20<02:50, 1.55it/s, loss=0.276, lr=1]\nSteps: 82%|████████▏ | 1236/1500 [13:20<02:50, 1.55it/s, loss=0.184, lr=1]\nSteps: 82%|████████▏ | 1237/1500 [13:21<02:49, 1.55it/s, loss=0.184, lr=1]\nSteps: 82%|████████▏ | 1237/1500 [13:21<02:49, 1.55it/s, loss=0.0454, lr=1]\nSteps: 83%|████████▎ | 1238/1500 [13:22<02:49, 1.55it/s, loss=0.0454, lr=1]\nSteps: 83%|████████▎ | 1238/1500 [13:22<02:49, 1.55it/s, loss=0.164, lr=1] \nSteps: 83%|████████▎ | 1239/1500 [13:22<02:48, 1.55it/s, loss=0.164, lr=1]\nSteps: 83%|████████▎ | 1239/1500 [13:22<02:48, 1.55it/s, loss=0.0809, lr=1]\nSteps: 83%|████████▎ | 1240/1500 [13:23<02:47, 1.55it/s, loss=0.0809, lr=1]\nSteps: 83%|████████▎ | 1240/1500 [13:23<02:47, 1.55it/s, loss=0.113, lr=1] \nSteps: 83%|████████▎ | 1241/1500 [13:24<02:47, 1.55it/s, loss=0.113, lr=1]\nSteps: 83%|████████▎ | 1241/1500 [13:24<02:47, 1.55it/s, loss=0.0706, lr=1]\nSteps: 83%|████████▎ | 1242/1500 [13:24<02:47, 1.54it/s, loss=0.0706, lr=1]\nSteps: 83%|████████▎ | 1242/1500 [13:24<02:47, 1.54it/s, loss=0.0901, lr=1]\nSteps: 83%|████████▎ | 1243/1500 [13:25<02:46, 1.54it/s, loss=0.0901, lr=1]\nSteps: 83%|████████▎ | 1243/1500 [13:25<02:46, 1.54it/s, loss=0.296, lr=1] \nSteps: 83%|████████▎ | 1244/1500 [13:26<02:45, 1.54it/s, loss=0.296, lr=1]\nSteps: 83%|████████▎ | 1244/1500 [13:26<02:45, 1.54it/s, loss=0.21, lr=1] \nSteps: 83%|████████▎ | 1245/1500 [13:26<02:45, 1.54it/s, loss=0.21, lr=1]\nSteps: 83%|████████▎ | 1245/1500 [13:26<02:45, 1.54it/s, loss=0.135, lr=1]\nSteps: 83%|████████▎ | 1246/1500 [13:27<02:44, 1.55it/s, loss=0.135, lr=1]\nSteps: 83%|████████▎ | 1246/1500 [13:27<02:44, 
1.55it/s, loss=0.098, lr=1]\nSteps: 83%|████████▎ | 1247/1500 [13:27<02:43, 1.55it/s, loss=0.098, lr=1]\nSteps: 83%|████████▎ | 1247/1500 [13:27<02:43, 1.55it/s, loss=0.152, lr=1]\nSteps: 83%|████████▎ | 1248/1500 [13:28<02:42, 1.55it/s, loss=0.152, lr=1]\nSteps: 83%|████████▎ | 1248/1500 [13:28<02:42, 1.55it/s, loss=0.194, lr=1]\nSteps: 83%|████████▎ | 1249/1500 [13:29<02:42, 1.54it/s, loss=0.194, lr=1]\nSteps: 83%|████████▎ | 1249/1500 [13:29<02:42, 1.54it/s, loss=0.0293, lr=1]\nSteps: 83%|████████▎ | 1250/1500 [13:29<02:42, 1.54it/s, loss=0.0293, lr=1]\nSteps: 83%|████████▎ | 1250/1500 [13:29<02:42, 1.54it/s, loss=0.127, lr=1] \nSteps: 83%|████████▎ | 1251/1500 [13:30<02:41, 1.55it/s, loss=0.127, lr=1]\nSteps: 83%|████████▎ | 1251/1500 [13:30<02:41, 1.55it/s, loss=0.0657, lr=1]\nSteps: 83%|████████▎ | 1252/1500 [13:31<02:40, 1.55it/s, loss=0.0657, lr=1]\nSteps: 83%|████████▎ | 1252/1500 [13:31<02:40, 1.55it/s, loss=0.145, lr=1] \nSteps: 84%|████████▎ | 1253/1500 [13:31<02:39, 1.55it/s, loss=0.145, lr=1]\nSteps: 84%|████████▎ | 1253/1500 [13:31<02:39, 1.55it/s, loss=0.0928, lr=1]\nSteps: 84%|████████▎ | 1254/1500 [13:32<02:38, 1.55it/s, loss=0.0928, lr=1]\nSteps: 84%|████████▎ | 1254/1500 [13:32<02:38, 1.55it/s, loss=0.0796, lr=1]\nSteps: 84%|████████▎ | 1255/1500 [13:33<02:38, 1.55it/s, loss=0.0796, lr=1]\nSteps: 84%|████████▎ | 1255/1500 [13:33<02:38, 1.55it/s, loss=0.073, lr=1] \nSteps: 84%|████████▎ | 1256/1500 [13:33<02:37, 1.55it/s, loss=0.073, lr=1]\nSteps: 84%|████████▎ | 1256/1500 [13:33<02:37, 1.55it/s, loss=0.162, lr=1]\nSteps: 84%|████████▍ | 1257/1500 [13:34<02:36, 1.55it/s, loss=0.162, lr=1]\nSteps: 84%|████████▍ | 1257/1500 [13:34<02:36, 1.55it/s, loss=0.123, lr=1]\nSteps: 84%|████████▍ | 1258/1500 [13:35<02:36, 1.55it/s, loss=0.123, lr=1]\nSteps: 84%|████████▍ | 1258/1500 [13:35<02:36, 1.55it/s, loss=0.315, lr=1]\nSteps: 84%|████████▍ | 1259/1500 [13:35<02:35, 1.55it/s, loss=0.315, lr=1]\nSteps: 84%|████████▍ | 1259/1500 [13:35<02:35, 1.55it/s, 
loss=0.156, lr=1]\nSteps: 84%|████████▍ | 1260/1500 [13:36<02:34, 1.55it/s, loss=0.156, lr=1]\nSteps: 84%|████████▍ | 1260/1500 [13:36<02:34, 1.55it/s, loss=0.186, lr=1]\nSteps: 84%|████████▍ | 1261/1500 [13:37<02:34, 1.55it/s, loss=0.186, lr=1]\nSteps: 84%|████████▍ | 1261/1500 [13:37<02:34, 1.55it/s, loss=0.0822, lr=1]\nSteps: 84%|████████▍ | 1262/1500 [13:37<02:33, 1.55it/s, loss=0.0822, lr=1]\nSteps: 84%|████████▍ | 1262/1500 [13:37<02:33, 1.55it/s, loss=0.2, lr=1] \nSteps: 84%|████████▍ | 1263/1500 [13:38<02:32, 1.55it/s, loss=0.2, lr=1]\nSteps: 84%|████████▍ | 1263/1500 [13:38<02:32, 1.55it/s, loss=0.171, lr=1]\nSteps: 84%|████████▍ | 1264/1500 [13:38<02:32, 1.55it/s, loss=0.171, lr=1]\nSteps: 84%|████████▍ | 1264/1500 [13:38<02:32, 1.55it/s, loss=0.231, lr=1]\nSteps: 84%|████████▍ | 1265/1500 [13:39<02:32, 1.54it/s, loss=0.231, lr=1]\nSteps: 84%|████████▍ | 1265/1500 [13:39<02:32, 1.54it/s, loss=0.183, lr=1]\nSteps: 84%|████████▍ | 1266/1500 [13:40<02:31, 1.54it/s, loss=0.183, lr=1]\nSteps: 84%|████████▍ | 1266/1500 [13:40<02:31, 1.54it/s, loss=0.137, lr=1]\nSteps: 84%|████████▍ | 1267/1500 [13:40<02:30, 1.54it/s, loss=0.137, lr=1]\nSteps: 84%|████████▍ | 1267/1500 [13:40<02:30, 1.54it/s, loss=0.119, lr=1]\nSteps: 85%|████████▍ | 1268/1500 [13:41<02:30, 1.54it/s, loss=0.119, lr=1]\nSteps: 85%|████████▍ | 1268/1500 [13:41<02:30, 1.54it/s, loss=0.0888, lr=1]\nSteps: 85%|████████▍ | 1269/1500 [13:42<02:29, 1.54it/s, loss=0.0888, lr=1]\nSteps: 85%|████████▍ | 1269/1500 [13:42<02:29, 1.54it/s, loss=0.063, lr=1] \nSteps: 85%|████████▍ | 1270/1500 [13:42<02:28, 1.55it/s, loss=0.063, lr=1]\nSteps: 85%|████████▍ | 1270/1500 [13:42<02:28, 1.55it/s, loss=0.173, lr=1]\nSteps: 85%|████████▍ | 1271/1500 [13:43<02:27, 1.55it/s, loss=0.173, lr=1]\nSteps: 85%|████████▍ | 1271/1500 [13:43<02:27, 1.55it/s, loss=0.0739, lr=1]\nSteps: 85%|████████▍ | 1272/1500 [13:44<02:27, 1.55it/s, loss=0.0739, lr=1]\nSteps: 85%|████████▍ | 1272/1500 [13:44<02:27, 1.55it/s, loss=0.161, lr=1] 
\nSteps: 85%|████████▍ | 1273/1500 [13:44<02:26, 1.55it/s, loss=0.161, lr=1]\nSteps: 85%|████████▍ | 1273/1500 [13:44<02:26, 1.55it/s, loss=0.138, lr=1]\nSteps: 85%|████████▍ | 1274/1500 [13:45<02:26, 1.55it/s, loss=0.138, lr=1]\nSteps: 85%|████████▍ | 1274/1500 [13:45<02:26, 1.55it/s, loss=0.166, lr=1]\nSteps: 85%|████████▌ | 1275/1500 [13:46<02:25, 1.55it/s, loss=0.166, lr=1]\nSteps: 85%|████████▌ | 1275/1500 [13:46<02:25, 1.55it/s, loss=0.103, lr=1]\nSteps: 85%|████████▌ | 1276/1500 [13:46<02:24, 1.55it/s, loss=0.103, lr=1]\nSteps: 85%|████████▌ | 1276/1500 [13:46<02:24, 1.55it/s, loss=0.193, lr=1]\nSteps: 85%|████████▌ | 1277/1500 [13:47<02:23, 1.55it/s, loss=0.193, lr=1]\nSteps: 85%|████████▌ | 1277/1500 [13:47<02:23, 1.55it/s, loss=0.382, lr=1]\nSteps: 85%|████████▌ | 1278/1500 [13:48<02:23, 1.55it/s, loss=0.382, lr=1]\nSteps: 85%|████████▌ | 1278/1500 [13:48<02:23, 1.55it/s, loss=0.211, lr=1]\nSteps: 85%|████████▌ | 1279/1500 [13:48<02:22, 1.55it/s, loss=0.211, lr=1]\nSteps: 85%|████████▌ | 1279/1500 [13:48<02:22, 1.55it/s, loss=0.0598, lr=1]\nSteps: 85%|████████▌ | 1280/1500 [13:49<02:21, 1.55it/s, loss=0.0598, lr=1]\nSteps: 85%|████████▌ | 1280/1500 [13:49<02:21, 1.55it/s, loss=0.158, lr=1] \nSteps: 85%|████████▌ | 1281/1500 [13:49<02:22, 1.54it/s, loss=0.158, lr=1]\nSteps: 85%|████████▌ | 1281/1500 [13:49<02:22, 1.54it/s, loss=0.139, lr=1]\nSteps: 85%|████████▌ | 1282/1500 [13:50<02:21, 1.54it/s, loss=0.139, lr=1]\nSteps: 85%|████████▌ | 1282/1500 [13:50<02:21, 1.54it/s, loss=0.189, lr=1]\nSteps: 86%|████████▌ | 1283/1500 [13:51<02:20, 1.54it/s, loss=0.189, lr=1]\nSteps: 86%|████████▌ | 1283/1500 [13:51<02:20, 1.54it/s, loss=0.108, lr=1]\nSteps: 86%|████████▌ | 1284/1500 [13:51<02:19, 1.55it/s, loss=0.108, lr=1]\nSteps: 86%|████████▌ | 1284/1500 [13:51<02:19, 1.55it/s, loss=0.0843, lr=1]\nSteps: 86%|████████▌ | 1285/1500 [13:52<02:18, 1.55it/s, loss=0.0843, lr=1]\nSteps: 86%|████████▌ | 1285/1500 [13:52<02:18, 1.55it/s, loss=0.0307, lr=1]\nSteps: 
100%|█████████▉| 1495/1500 [16:08<00:03, 1.55it/s, loss=0.0837, lr=1]\nSteps: 100%|█████████▉| 1495/1500 [16:08<00:03, 1.55it/s, loss=0.198, lr=1] \nSteps: 100%|█████████▉| 1496/1500 [16:08<00:02, 1.55it/s, loss=0.198, lr=1]\nSteps: 100%|█████████▉| 1496/1500 [16:08<00:02, 1.55it/s, loss=0.146, lr=1]\nSteps: 100%|█████████▉| 1497/1500 [16:09<00:01, 1.55it/s, loss=0.146, lr=1]\nSteps: 100%|█████████▉| 1497/1500 [16:09<00:01, 1.55it/s, loss=0.186, lr=1]\nSteps: 100%|█████████▉| 1498/1500 [16:10<00:01, 1.55it/s, loss=0.186, lr=1]\nSteps: 100%|█████████▉| 1498/1500 [16:10<00:01, 1.55it/s, loss=0.15, lr=1] \nSteps: 100%|█████████▉| 1499/1500 [16:10<00:00, 1.55it/s, loss=0.15, lr=1]\nSteps: 100%|█████████▉| 1499/1500 [16:10<00:00, 1.55it/s, loss=0.105, lr=1]\nSteps: 100%|██████████| 1500/1500 [16:11<00:00, 1.55it/s, loss=0.105, lr=1]\nSteps: 100%|██████████| 1500/1500 [16:11<00:00, 1.55it/s, loss=0.0747, lr=1]\nModel weights saved in /tmp/train/output/sd35_large_train_replicate/pytorch_lora_weights.safetensors\nLoading pipeline components...: 0%| | 0/9 [00:00<?, ?it/s]\nLoaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stable-diffusion-3.5-large.\nLoaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stable-diffusion-3.5-large.\nLoading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]\nLoading checkpoint shards: 50%|█████ | 1/2 [00:05<00:05, 5.20s/it]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.93s/it]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.97s/it]\nLoaded text_encoder_3 as T5EncoderModel from `text_encoder_3` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 33%|███▎ | 3/9 [00:10<00:20, 3.36s/it]\n{'dual_attention_layers'} was not found in config. 
Values will be initialized to default values.\nLoaded transformer as SD3Transformer2DModel from `transformer` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 44%|████▍ | 4/9 [00:12<00:14, 2.90s/it]\n{'max_image_seq_len', 'base_image_seq_len', 'max_shift', 'base_shift', 'use_dynamic_shifting'} was not found in config. Values will be initialized to default values.\nLoaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of stable-diffusion-3.5-large.\nLoaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 67%|██████▋ | 6/9 [00:13<00:05, 1.91s/it]\nLoaded text_encoder as CLIPTextModelWithProjection from `text_encoder` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 78%|███████▊ | 7/9 [00:14<00:03, 1.53s/it]\nLoaded vae as AutoencoderKL from `vae` subfolder of stable-diffusion-3.5-large.\nLoaded tokenizer_3 as T5TokenizerFast from `tokenizer_3` subfolder of stable-diffusion-3.5-large.\nLoading pipeline components...: 100%|██████████| 9/9 [00:14<00:00, 1.07it/s]\nLoading pipeline components...: 100%|██████████| 9/9 [00:14<00:00, 1.59s/it]\nLoading text_encoder.\nLoading text_encoder_2.\n 0%| | 0/1 [00:00<?, ?it/s]\n100%|██████████| 1/1 [00:01<00:00, 1.11s/it]\n100%|██████████| 1/1 [00:01<00:00, 1.11s/it]\nSteps: 100%|██████████| 1500/1500 [16:29<00:00, 1.52it/s, loss=0.0747, lr=1]\n./\n./output/\n./output/sd35_large_train_replicate/\n./output/sd35_large_train_replicate/README.md\n./output/sd35_large_train_replicate/lora.safetensors", "metrics": { "predict_time": 1045.489331425, "total_time": 1102.244748 }, "output": "https://replicate.delivery/yhqm/Wq2g9S3byRKRA19jSIgbAe4q6nXEefe2oz8XeQTKPCjXKwldC/trained_model.tar", "started_at": "2024-11-01T19:21:33.333417Z", "status": "succeeded", "urls": { "stream": 
"https://stream.replicate.com/v1/files/qoxq-7hrdyy752fgxwlpsj3565cc46qad5zjvvcapoigi76xv7ez4rzoa", "get": "https://api.replicate.com/v1/predictions/6ty97311w9rj00cjx4pbwngcp0", "cancel": "https://api.replicate.com/v1/predictions/6ty97311w9rj00cjx4pbwngcp0/cancel" }, "version": "6ebda45af5b9c30edee3149cc1624b7f7cae8fab7c692e2c51d82f5fed3198ee" }
Using seed: 2592116838
Extracted 16 files from zip to input_images
The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful
Using params: ['accelerate', 'launch', '--dynamo_backend', 'no', 'train_dreambooth_lora_sd3.py', '--pretrained_model_name_or_path', 'stable-diffusion-3.5-large', '--instance_data_dir', 'input_images', '--rank', '16', '--output_dir', '/tmp/train/output/sd35_large_train_replicate', '--mixed_precision', 'bf16', '--instance_prompt', 'Frog, yarn art style', '--resolution', '768', '--train_batch_size', '1', '--gradient_accumulation_steps', '1', '--optimizer', 'prodigy', '--learning_rate', '1.0', '--text_encoder_lr', '1.0', '--lr_scheduler', 'constant', '--lr_warmup_steps', '0', '--max_train_steps', '1500', '--checkpointing_steps', '1501', '--seed', '2592116838', '--logging_dir', '/tmp/logs', '--push_to_hub', '--hub_token', 'hf_[REDACTED]', '--hub_model_id', 'lucataco/SD3.5-Large-yarn-2', '--train_text_encoder']
11/01/2024 19:21:43 - INFO - __main__ - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: bf16
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type t5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'max_image_seq_len', 'base_image_seq_len', 'max_shift', 'base_shift', 'use_dynamic_shifting'} was not found in config. Values will be initialized to default values.
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 50%|█████ | 1/2 [00:03<00:03, 3.58s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:06<00:00, 3.44s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:06<00:00, 3.46s/it]
{'dual_attention_layers'} was not found in config. Values will be initialized to default values.
11/01/2024 19:22:26 - WARNING - __main__ - Learning rates were provided both for the transformer and the text encoder- e.g. text_encoder_lr: 1.0 and learning_rate: 1.0. When using prodigy only learning_rate is used as the initial learning rate.
Using decoupled weight decay
11/01/2024 19:22:27 - INFO - __main__ - ***** Running training *****
11/01/2024 19:22:27 - INFO - __main__ - Num examples = 16
11/01/2024 19:22:27 - INFO - __main__ - Num batches each epoch = 16
11/01/2024 19:22:27 - INFO - __main__ - Num Epochs = 94
11/01/2024 19:22:27 - INFO - __main__ - Instantaneous batch size per device = 1
11/01/2024 19:22:27 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 1
11/01/2024 19:22:27 - INFO - __main__ - Gradient Accumulation steps = 1
11/01/2024 19:22:27 - INFO - __main__ - Total optimization steps = 1500
Steps: 0%| | 0/1500 [00:00<?, ?it/s] Steps: 0%| | 1/1500 [00:01<42:23, 1.70s/it] Steps: 0%| | 1/1500 [00:01<42:23, 1.70s/it, loss=0.0865, lr=1] Steps: 0%| | 2/1500 [00:02<26:52, 1.08s/it, loss=0.0865, lr=1] Steps: 0%| | 2/1500 [00:02<26:52, 1.08s/it, loss=0.192, lr=1] Steps: 0%| | 3/1500 [00:02<21:54, 1.14it/s, loss=0.192, lr=1] Steps: 0%| | 3/1500 [00:02<21:54, 1.14it/s, loss=0.0926, lr=1] Steps: 0%| | 4/1500 [00:03<19:34, 1.27it/s, loss=0.0926, lr=1] Steps: 0%| | 4/1500 [00:03<19:34, 1.27it/s, loss=0.113, lr=1] Steps: 0%| | 5/ [...] log volume exceeds 256KiB size limit: truncating logs [...] 
0 [00:30<15:39, 1.55it/s, loss=0.205, lr=1] Steps: 3%|▎ | 46/1500 [00:30<15:39, 1.55it/s, loss=0.26, lr=1] Steps: 3%|▎ | 47/1500 [00:31<15:39, 1.55it/s, loss=0.26, lr=1] Steps: 3%|▎ | 47/1500 [00:31<15:39, 1.55it/s, loss=0.148, lr=1] Steps: 3%|▎ | 48/1500 [00:32<15:38, 1.55it/s, loss=0.148, lr=1] Steps: 3%|▎ | 48/1500 [00:32<15:38, 1.55it/s, loss=0.264, lr=1] Steps: 3%|▎ | 49/1500 [00:32<15:44, 1.54it/s, loss=0.264, lr=1] Steps: 3%|▎ | 49/1500 [00:32<15:44, 1.54it/s, loss=0.0476, lr=1] Steps: 3%|▎ | 50/1500 [00:33<15:41, 1.54it/s, loss=0.0476, lr=1] Steps: 3%|▎ | 50/1500 [00:33<15:41, 1.54it/s, loss=0.0878, lr=1] Steps: 3%|▎ | 51/1500 [00:34<15:41, 1.54it/s, loss=0.0878, lr=1] Steps: 3%|▎ | 51/1500 [00:34<15:41, 1.54it/s, loss=0.201, lr=1] Steps: 3%|▎ | 52/1500 [00:34<15:41, 1.54it/s, loss=0.201, lr=1] Steps: 3%|▎ | 52/1500 [00:34<15:41, 1.54it/s, loss=0.114, lr=1] Steps: 4%|▎ | 53/1500 [00:35<15:38, 1.54it/s, loss=0.114, lr=1] Steps: 4%|▎ | 53/1500 [00:35<15:38, 1.54it/s, loss=0.165, lr=1] Steps: 4%|▎ | 54/1500 [00:36<15:36, 1.54it/s, loss=0.165, lr=1] Steps: 4%|▎ | 54/1500 [00:36<15:36, 1.54it/s, loss=0.148, lr=1] Steps: 4%|▎ | 55/1500 [00:36<15:35, 1.55it/s, loss=0.148, lr=1] Steps: 4%|▎ | 55/1500 [00:36<15:35, 1.55it/s, loss=0.225, lr=1] Steps: 4%|▎ | 56/1500 [00:37<15:33, 1.55it/s, loss=0.225, lr=1] Steps: 4%|▎ | 56/1500 [00:37<15:33, 1.55it/s, loss=0.242, lr=1] Steps: 4%|▍ | 57/1500 [00:38<15:32, 1.55it/s, loss=0.242, lr=1] Steps: 4%|▍ | 57/1500 [00:38<15:32, 1.55it/s, loss=0.0901, lr=1] Steps: 4%|▍ | 58/1500 [00:38<15:30, 1.55it/s, loss=0.0901, lr=1] Steps: 4%|▍ | 58/1500 [00:38<15:30, 1.55it/s, loss=0.321, lr=1] Steps: 4%|▍ | 59/1500 [00:39<15:30, 1.55it/s, loss=0.321, lr=1] Steps: 4%|▍ | 59/1500 [00:39<15:30, 1.55it/s, loss=0.105, lr=1] Steps: 4%|▍ | 60/1500 [00:39<15:29, 1.55it/s, loss=0.105, lr=1] Steps: 4%|▍ | 60/1500 [00:39<15:29, 1.55it/s, loss=0.145, lr=1] Steps: 4%|▍ | 61/1500 [00:40<15:29, 1.55it/s, loss=0.145, lr=1] Steps: 4%|▍ | 61/1500 
[00:40<15:29, 1.55it/s, loss=0.108, lr=1] Steps: 4%|▍ | 62/1500 [00:41<15:29, 1.55it/s, loss=0.108, lr=1] Steps: 4%|▍ | 62/1500 [00:41<15:29, 1.55it/s, loss=0.142, lr=1] Steps: 4%|▍ | 63/1500 [00:41<15:28, 1.55it/s, loss=0.142, lr=1] Steps: 4%|▍ | 63/1500 [00:41<15:28, 1.55it/s, loss=0.311, lr=1] Steps: 4%|▍ | 64/1500 [00:42<15:27, 1.55it/s, loss=0.311, lr=1] Steps: 4%|▍ | 64/1500 [00:42<15:27, 1.55it/s, loss=0.178, lr=1] Steps: 4%|▍ | 65/1500 [00:43<15:32, 1.54it/s, loss=0.178, lr=1] Steps: 4%|▍ | 65/1500 [00:43<15:32, 1.54it/s, loss=0.221, lr=1] Steps: 4%|▍ | 66/1500 [00:43<15:29, 1.54it/s, loss=0.221, lr=1] Steps: 4%|▍ | 66/1500 [00:43<15:29, 1.54it/s, loss=0.121, lr=1] Steps: 4%|▍ | 67/1500 [00:44<15:28, 1.54it/s, loss=0.121, lr=1] Steps: 4%|▍ | 67/1500 [00:44<15:28, 1.54it/s, loss=0.151, lr=1] Steps: 5%|▍ | 68/1500 [00:45<15:26, 1.55it/s, loss=0.151, lr=1] Steps: 5%|▍ | 68/1500 [00:45<15:26, 1.55it/s, loss=0.123, lr=1] Steps: 5%|▍ | 69/1500 [00:45<15:25, 1.55it/s, loss=0.123, lr=1] Steps: 5%|▍ | 69/1500 [00:45<15:25, 1.55it/s, loss=0.164, lr=1] Steps: 5%|▍ | 70/1500 [00:46<15:23, 1.55it/s, loss=0.164, lr=1] Steps: 5%|▍ | 70/1500 [00:46<15:23, 1.55it/s, loss=0.223, lr=1] Steps: 5%|▍ | 71/1500 [00:47<15:22, 1.55it/s, loss=0.223, lr=1] Steps: 5%|▍ | 71/1500 [00:47<15:22, 1.55it/s, loss=0.121, lr=1] Steps: 5%|▍ | 72/1500 [00:47<15:20, 1.55it/s, loss=0.121, lr=1] Steps: 5%|▍ | 72/1500 [00:47<15:20, 1.55it/s, loss=0.172, lr=1] Steps: 5%|▍ | 73/1500 [00:48<15:19, 1.55it/s, loss=0.172, lr=1] Steps: 5%|▍ | 73/1500 [00:48<15:19, 1.55it/s, loss=0.185, lr=1] Steps: 5%|▍ | 74/1500 [00:49<15:17, 1.55it/s, loss=0.185, lr=1] Steps: 5%|▍ | 74/1500 [00:49<15:17, 1.55it/s, loss=0.116, lr=1] Steps: 5%|▌ | 75/1500 [00:49<15:18, 1.55it/s, loss=0.116, lr=1] Steps: 5%|▌ | 75/1500 [00:49<15:18, 1.55it/s, loss=0.0638, lr=1] Steps: 5%|▌ | 76/1500 [00:50<15:18, 1.55it/s, loss=0.0638, lr=1] Steps: 5%|▌ | 76/1500 [00:50<15:18, 1.55it/s, loss=0.192, lr=1] Steps: 5%|▌ | 77/1500 [00:50<15:16, 
1.55it/s, loss=0.192, lr=1] Steps: 5%|▌ | 77/1500 [00:50<15:16, 1.55it/s, loss=0.0889, lr=1] Steps: 5%|▌ | 78/1500 [00:51<15:16, 1.55it/s, loss=0.0889, lr=1] Steps: 5%|▌ | 78/1500 [00:51<15:16, 1.55it/s, loss=0.156, lr=1] Steps: 5%|▌ | 79/1500 [00:52<15:15, 1.55it/s, loss=0.156, lr=1] Steps: 5%|▌ | 79/1500 [00:52<15:15, 1.55it/s, loss=0.0374, lr=1] Steps: 5%|▌ | 80/1500 [00:52<15:13, 1.55it/s, loss=0.0374, lr=1] Steps: 5%|▌ | 80/1500 [00:52<15:13, 1.55it/s, loss=0.0846, lr=1] Steps: 5%|▌ | 81/1500 [00:53<15:18, 1.54it/s, loss=0.0846, lr=1] Steps: 5%|▌ | 81/1500 [00:53<15:18, 1.54it/s, loss=0.143, lr=1] Steps: 5%|▌ | 82/1500 [00:54<15:19, 1.54it/s, loss=0.143, lr=1] Steps: 5%|▌ | 82/1500 [00:54<15:19, 1.54it/s, loss=0.0692, lr=1] Steps: 6%|▌ | 83/1500 [00:54<15:18, 1.54it/s, loss=0.0692, lr=1] Steps: 6%|▌ | 83/1500 [00:54<15:18, 1.54it/s, loss=0.214, lr=1] Steps: 6%|▌ | 84/1500 [00:55<15:16, 1.54it/s, loss=0.214, lr=1] Steps: 6%|▌ | 84/1500 [00:55<15:16, 1.54it/s, loss=0.0942, lr=1] Steps: 6%|▌ | 85/1500 [00:56<15:14, 1.55it/s, loss=0.0942, lr=1] Steps: 6%|▌ | 85/1500 [00:56<15:14, 1.55it/s, loss=0.207, lr=1] Steps: 6%|▌ | 86/1500 [00:56<15:13, 1.55it/s, loss=0.207, lr=1] Steps: 6%|▌ | 86/1500 [00:56<15:13, 1.55it/s, loss=0.255, lr=1] Steps: 6%|▌ | 87/1500 [00:57<15:12, 1.55it/s, loss=0.255, lr=1] Steps: 6%|▌ | 87/1500 [00:57<15:12, 1.55it/s, loss=0.111, lr=1] Steps: 6%|▌ | 88/1500 [00:58<15:12, 1.55it/s, loss=0.111, lr=1] Steps: 6%|▌ | 88/1500 [00:58<15:12, 1.55it/s, loss=0.0414, lr=1] Steps: 6%|▌ | 89/1500 [00:58<15:10, 1.55it/s, loss=0.0414, lr=1] Steps: 6%|▌ | 89/1500 [00:58<15:10, 1.55it/s, loss=0.192, lr=1] Steps: 6%|▌ | 90/1500 [00:59<15:09, 1.55it/s, loss=0.192, lr=1] Steps: 6%|▌ | 90/1500 [00:59<15:09, 1.55it/s, loss=0.105, lr=1] Steps: 6%|▌ | 91/1500 [00:59<15:09, 1.55it/s, loss=0.105, lr=1] Steps: 6%|▌ | 91/1500 [00:59<15:09, 1.55it/s, loss=0.131, lr=1] Steps: 6%|▌ | 92/1500 [01:00<15:08, 1.55it/s, loss=0.131, lr=1] Steps: 6%|▌ | 92/1500 [01:00<15:08, 
1.55it/s, loss=0.0768, lr=1] Steps: 6%|▌ | 93/1500 [01:01<15:11, 1.54it/s, loss=0.0768, lr=1] Steps: 6%|▌ | 93/1500 [01:01<15:11, 1.54it/s, loss=0.0205, lr=1] Steps: 6%|▋ | 94/1500 [01:01<15:08, 1.55it/s, loss=0.0205, lr=1] Steps: 6%|▋ | 94/1500 [01:01<15:08, 1.55it/s, loss=0.137, lr=1] Steps: 6%|▋ | 95/1500 [01:02<15:08, 1.55it/s, loss=0.137, lr=1] Steps: 6%|▋ | 95/1500 [01:02<15:08, 1.55it/s, loss=0.0869, lr=1] Steps: 6%|▋ | 96/1500 [01:03<15:05, 1.55it/s, loss=0.0869, lr=1] Steps: 6%|▋ | 96/1500 [01:03<15:05, 1.55it/s, loss=0.125, lr=1] Steps: 6%|▋ | 97/1500 [01:03<15:11, 1.54it/s, loss=0.125, lr=1] Steps: 6%|▋ | 97/1500 [01:03<15:11, 1.54it/s, loss=0.169, lr=1] Steps: 7%|▋ | 98/1500 [01:04<15:10, 1.54it/s, loss=0.169, lr=1] Steps: 7%|▋ | 98/1500 [01:04<15:10, 1.54it/s, loss=0.148, lr=1] Steps: 7%|▋ | 99/1500 [01:05<15:07, 1.54it/s, loss=0.148, lr=1] Steps: 7%|▋ | 99/1500 [01:05<15:07, 1.54it/s, loss=0.311, lr=1] Steps: 7%|▋ | 100/1500 [01:05<15:10, 1.54it/s, loss=0.311, lr=1] Steps: 7%|▋ | 100/1500 [01:05<15:10, 1.54it/s, loss=0.107, lr=1] Steps: 7%|▋ | 101/1500 [01:06<15:08, 1.54it/s, loss=0.107, lr=1] Steps: 7%|▋ | 101/1500 [01:06<15:08, 1.54it/s, loss=0.17, lr=1] Steps: 7%|▋ | 102/1500 [01:07<15:09, 1.54it/s, loss=0.17, lr=1] Steps: 7%|▋ | 102/1500 [01:07<15:09, 1.54it/s, loss=0.0375, lr=1] Steps: 7%|▋ | 103/1500 [01:07<15:06, 1.54it/s, loss=0.0375, lr=1] Steps: 7%|▋ | 103/1500 [01:07<15:06, 1.54it/s, loss=0.168, lr=1] Steps: 7%|▋ | 104/1500 [01:08<15:04, 1.54it/s, loss=0.168, lr=1] Steps: 7%|▋ | 104/1500 [01:08<15:04, 1.54it/s, loss=0.111, lr=1] Steps: 7%|▋ | 105/1500 [01:09<15:01, 1.55it/s, loss=0.111, lr=1] Steps: 7%|▋ | 105/1500 [01:09<15:01, 1.55it/s, loss=0.0957, lr=1] Steps: 7%|▋ | 106/1500 [01:09<15:00, 1.55it/s, loss=0.0957, lr=1] Steps: 7%|▋ | 106/1500 [01:09<15:00, 1.55it/s, loss=0.392, lr=1] Steps: 7%|▋ | 107/1500 [01:10<14:59, 1.55it/s, loss=0.392, lr=1] Steps: 7%|▋ | 107/1500 [01:10<14:59, 1.55it/s, loss=0.279, lr=1] Steps: 7%|▋ | 108/1500 
[01:11<15:00, 1.55it/s, loss=0.279, lr=1] Steps: 7%|▋ | 108/1500 [01:11<15:00, 1.55it/s, loss=0.331, lr=1] Steps: 7%|▋ | 109/1500 [01:11<14:59, 1.55it/s, loss=0.331, lr=1] Steps: 7%|▋ | 109/1500 [01:11<14:59, 1.55it/s, loss=0.236, lr=1] Steps: 7%|▋ | 110/1500 [01:12<14:59, 1.55it/s, loss=0.236, lr=1] Steps: 7%|▋ | 110/1500 [01:12<14:59, 1.55it/s, loss=0.226, lr=1] Steps: 7%|▋ | 111/1500 [01:12<14:58, 1.55it/s, loss=0.226, lr=1] Steps: 7%|▋ | 111/1500 [01:12<14:58, 1.55it/s, loss=0.0534, lr=1] Steps: 7%|▋ | 112/1500 [01:13<14:58, 1.54it/s, loss=0.0534, lr=1] Steps: 7%|▋ | 112/1500 [01:13<14:58, 1.54it/s, loss=0.264, lr=1] Steps: 8%|▊ | 113/1500 [01:14<15:03, 1.53it/s, loss=0.264, lr=1] Steps: 8%|▊ | 113/1500 [01:14<15:03, 1.53it/s, loss=0.284, lr=1] Steps: 8%|▊ | 114/1500 [01:14<15:01, 1.54it/s, loss=0.284, lr=1] Steps: 8%|▊ | 114/1500 [01:14<15:01, 1.54it/s, loss=0.138, lr=1] Steps: 8%|▊ | 115/1500 [01:15<14:58, 1.54it/s, loss=0.138, lr=1] Steps: 8%|▊ | 115/1500 [01:15<14:58, 1.54it/s, loss=0.0913, lr=1] Steps: 8%|▊ | 116/1500 [01:16<14:57, 1.54it/s, loss=0.0913, lr=1] Steps: 8%|▊ | 116/1500 [01:16<14:57, 1.54it/s, loss=0.0745, lr=1] Steps: 8%|▊ | 117/1500 [01:16<14:56, 1.54it/s, loss=0.0745, lr=1] Steps: 8%|▊ | 117/1500 [01:16<14:56, 1.54it/s, loss=0.124, lr=1] Steps: 8%|▊ | 118/1500 [01:17<14:56, 1.54it/s, loss=0.124, lr=1] Steps: 8%|▊ | 118/1500 [01:17<14:56, 1.54it/s, loss=0.22, lr=1] Steps: 8%|▊ | 119/1500 [01:18<14:55, 1.54it/s, loss=0.22, lr=1] Steps: 8%|▊ | 119/1500 [01:18<14:55, 1.54it/s, loss=0.115, lr=1] Steps: 8%|▊ | 120/1500 [01:18<14:53, 1.55it/s, loss=0.115, lr=1] Steps: 8%|▊ | 120/1500 [01:18<14:53, 1.55it/s, loss=0.137, lr=1] Steps: 8%|▊ | 121/1500 [01:19<14:51, 1.55it/s, loss=0.137, lr=1] Steps: 8%|▊ | 121/1500 [01:19<14:51, 1.55it/s, loss=0.223, lr=1] Steps: 8%|▊ | 122/1500 [01:20<14:49, 1.55it/s, loss=0.223, lr=1] Steps: 8%|▊ | 122/1500 [01:20<14:49, 1.55it/s, loss=0.121, lr=1] Steps: 8%|▊ | 123/1500 [01:20<14:48, 1.55it/s, loss=0.121, lr=1] 
Steps: 8%|▊ | 123/1500 [01:20<14:48, 1.55it/s, loss=0.332, lr=1] Steps: 8%|▊ | 124/1500 [01:21<14:47, 1.55it/s, loss=0.332, lr=1] Steps: 8%|▊ | 124/1500 [01:21<14:47, 1.55it/s, loss=0.054, lr=1] Steps: 8%|▊ | 125/1500 [01:22<14:46, 1.55it/s, loss=0.054, lr=1] Steps: 8%|▊ | 125/1500 [01:22<14:46, 1.55it/s, loss=0.109, lr=1] Steps: 8%|▊ | 126/1500 [01:22<14:44, 1.55it/s, loss=0.109, lr=1] Steps: 8%|▊ | 126/1500 [01:22<14:44, 1.55it/s, loss=0.302, lr=1] Steps: 8%|▊ | 127/1500 [01:23<14:45, 1.55it/s, loss=0.302, lr=1] Steps: 8%|▊ | 127/1500 [01:23<14:45, 1.55it/s, loss=0.127, lr=1] Steps: 9%|▊ | 128/1500 [01:23<14:45, 1.55it/s, loss=0.127, lr=1] Steps: 9%|▊ | 128/1500 [01:23<14:45, 1.55it/s, loss=0.064, lr=1] Steps: 9%|▊ | 129/1500 [01:24<14:49, 1.54it/s, loss=0.064, lr=1] Steps: 9%|▊ | 129/1500 [01:24<14:49, 1.54it/s, loss=0.158, lr=1] Steps: 9%|▊ | 130/1500 [01:25<14:47, 1.54it/s, loss=0.158, lr=1] Steps: 9%|▊ | 130/1500 [01:25<14:47, 1.54it/s, loss=0.11, lr=1] Steps: 9%|▊ | 131/1500 [01:25<14:46, 1.54it/s, loss=0.11, lr=1] Steps: 9%|▊ | 131/1500 [01:25<14:46, 1.54it/s, loss=0.0911, lr=1] Steps: 9%|▉ | 132/1500 [01:26<14:44, 1.55it/s, loss=0.0911, lr=1] Steps: 9%|▉ | 132/1500 [01:26<14:44, 1.55it/s, loss=0.202, lr=1] Steps: 9%|▉ | 133/1500 [01:27<14:42, 1.55it/s, loss=0.202, lr=1] Steps: 9%|▉ | 133/1500 [01:27<14:42, 1.55it/s, loss=0.136, lr=1] Steps: 9%|▉ | 134/1500 [01:27<14:41, 1.55it/s, loss=0.136, lr=1] Steps: 9%|▉ | 134/1500 [01:27<14:41, 1.55it/s, loss=0.159, lr=1] Steps: 9%|▉ | 135/1500 [01:28<14:41, 1.55it/s, loss=0.159, lr=1] Steps: 9%|▉ | 135/1500 [01:28<14:41, 1.55it/s, loss=0.128, lr=1] Steps: 9%|▉ | 136/1500 [01:29<14:39, 1.55it/s, loss=0.128, lr=1] Steps: 9%|▉ | 136/1500 [01:29<14:39, 1.55it/s, loss=0.256, lr=1] Steps: 9%|▉ | 137/1500 [01:29<14:38, 1.55it/s, loss=0.256, lr=1] Steps: 9%|▉ | 137/1500 [01:29<14:38, 1.55it/s, loss=0.0531, lr=1] Steps: 9%|▉ | 138/1500 [01:30<14:38, 1.55it/s, loss=0.0531, lr=1] Steps: 9%|▉ | 138/1500 [01:30<14:38, 1.55it/s, 
loss=0.126, lr=1] Steps: 9%|▉ | 139/1500 [01:31<14:38, 1.55it/s, loss=0.126, lr=1] Steps: 9%|▉ | 139/1500 [01:31<14:38, 1.55it/s, loss=0.13, lr=1] Steps: 9%|▉ | 140/1500 [01:31<14:36, 1.55it/s, loss=0.13, lr=1] Steps: 9%|▉ | 140/1500 [01:31<14:36, 1.55it/s, loss=0.154, lr=1] Steps: 9%|▉ | 141/1500 [01:32<14:36, 1.55it/s, loss=0.154, lr=1] Steps: 9%|▉ | 141/1500 [01:32<14:36, 1.55it/s, loss=0.242, lr=1] Steps: 9%|▉ | 142/1500 [01:32<14:36, 1.55it/s, loss=0.242, lr=1] Steps: 9%|▉ | 142/1500 [01:32<14:36, 1.55it/s, loss=0.194, lr=1] Steps: 10%|▉ | 143/1500 [01:33<14:35, 1.55it/s, loss=0.194, lr=1] Steps: 10%|▉ | 143/1500 [01:33<14:35, 1.55it/s, loss=0.254, lr=1] Steps: 10%|▉ | 144/1500 [01:34<14:35, 1.55it/s, loss=0.254, lr=1] Steps: 10%|▉ | 144/1500 [01:34<14:35, 1.55it/s, loss=0.185, lr=1] Steps: 10%|▉ | 145/1500 [01:34<14:41, 1.54it/s, loss=0.185, lr=1] Steps: 10%|▉ | 145/1500 [01:34<14:41, 1.54it/s, loss=0.163, lr=1] Steps: 10%|▉ | 146/1500 [01:35<14:41, 1.54it/s, loss=0.163, lr=1] Steps: 10%|▉ | 146/1500 [01:35<14:41, 1.54it/s, loss=0.153, lr=1] Steps: 10%|▉ | 147/1500 [01:36<14:39, 1.54it/s, loss=0.153, lr=1] Steps: 10%|▉ | 147/1500 [01:36<14:39, 1.54it/s, loss=0.331, lr=1] Steps: 10%|▉ | 148/1500 [01:36<14:37, 1.54it/s, loss=0.331, lr=1] Steps: 10%|▉ | 148/1500 [01:36<14:37, 1.54it/s, loss=0.0639, lr=1] Steps: 10%|▉ | 149/1500 [01:37<14:36, 1.54it/s, loss=0.0639, lr=1] Steps: 10%|▉ | 149/1500 [01:37<14:36, 1.54it/s, loss=0.188, lr=1] Steps: 10%|█ | 150/1500 [01:38<14:33, 1.54it/s, loss=0.188, lr=1] Steps: 10%|█ | 150/1500 [01:38<14:33, 1.54it/s, loss=0.121, lr=1] Steps: 10%|█ | 151/1500 [01:38<14:32, 1.55it/s, loss=0.121, lr=1] Steps: 10%|█ | 151/1500 [01:38<14:32, 1.55it/s, loss=0.133, lr=1] Steps: 10%|█ | 152/1500 [01:39<14:31, 1.55it/s, loss=0.133, lr=1] Steps: 10%|█ | 152/1500 [01:39<14:31, 1.55it/s, loss=0.189, lr=1] Steps: 10%|█ | 153/1500 [01:40<14:30, 1.55it/s, loss=0.189, lr=1] Steps: 10%|█ | 153/1500 [01:40<14:30, 1.55it/s, loss=0.169, lr=1] Steps: 
10%|█ | 154/1500 [01:40<14:29, 1.55it/s, loss=0.169, lr=1] Steps: 10%|█ | 154/1500 [01:40<14:29, 1.55it/s, loss=0.152, lr=1] Steps: 10%|█ | 155/1500 [01:41<14:29, 1.55it/s, loss=0.152, lr=1] Steps: 10%|█ | 155/1500 [01:41<14:29, 1.55it/s, loss=0.159, lr=1] Steps: 10%|█ | 156/1500 [01:42<14:28, 1.55it/s, loss=0.159, lr=1] Steps: 10%|█ | 156/1500 [01:42<14:28, 1.55it/s, loss=0.139, lr=1] Steps: 10%|█ | 157/1500 [01:42<14:27, 1.55it/s, loss=0.139, lr=1] Steps: 10%|█ | 157/1500 [01:42<14:27, 1.55it/s, loss=0.15, lr=1] Steps: 11%|█ | 158/1500 [01:43<14:26, 1.55it/s, loss=0.15, lr=1] Steps: 11%|█ | 158/1500 [01:43<14:26, 1.55it/s, loss=0.164, lr=1] Steps: 11%|█ | 159/1500 [01:43<14:26, 1.55it/s, loss=0.164, lr=1] Steps: 11%|█ | 159/1500 [01:43<14:26, 1.55it/s, loss=0.171, lr=1] Steps: 11%|█ | 160/1500 [01:44<14:24, 1.55it/s, loss=0.171, lr=1] Steps: 11%|█ | 160/1500 [01:44<14:24, 1.55it/s, loss=0.22, lr=1] Steps: 11%|█ | 161/1500 [01:45<14:29, 1.54it/s, loss=0.22, lr=1] Steps: 11%|█ | 161/1500 [01:45<14:29, 1.54it/s, loss=0.169, lr=1] Steps: 11%|█ | 162/1500 [01:45<14:27, 1.54it/s, loss=0.169, lr=1] Steps: 11%|█ | 162/1500 [01:45<14:27, 1.54it/s, loss=0.116, lr=1] Steps: 11%|█ | 163/1500 [01:46<14:25, 1.54it/s, loss=0.116, lr=1] Steps: 11%|█ | 163/1500 [01:46<14:25, 1.54it/s, loss=0.11, lr=1] Steps: 11%|█ | 164/1500 [01:47<14:24, 1.55it/s, loss=0.11, lr=1] Steps: 11%|█ | 164/1500 [01:47<14:24, 1.55it/s, loss=0.143, lr=1] Steps: 11%|█ | 165/1500 [01:47<14:22, 1.55it/s, loss=0.143, lr=1] Steps: 11%|█ | 165/1500 [01:47<14:22, 1.55it/s, loss=0.108, lr=1] Steps: 11%|█ | 166/1500 [01:48<14:21, 1.55it/s, loss=0.108, lr=1] Steps: 11%|█ | 166/1500 [01:48<14:21, 1.55it/s, loss=0.154, lr=1] Steps: 11%|█ | 167/1500 [01:49<14:20, 1.55it/s, loss=0.154, lr=1] Steps: 11%|█ | 167/1500 [01:49<14:20, 1.55it/s, loss=0.1, lr=1] Steps: 11%|█ | 168/1500 [01:49<14:19, 1.55it/s, loss=0.1, lr=1] Steps: 11%|█ | 168/1500 [01:49<14:19, 1.55it/s, loss=0.195, lr=1] Steps: 11%|█▏ | 169/1500 
[01:50<14:19, 1.55it/s, loss=0.195, lr=1] Steps: 11%|█▏ | 169/1500 [01:50<14:19, 1.55it/s, loss=0.155, lr=1] Steps: 11%|█▏ | 170/1500 [01:51<14:18, 1.55it/s, loss=0.155, lr=1] Steps: 11%|█▏ | 170/1500 [01:51<14:18, 1.55it/s, loss=0.241, lr=1] Steps: 11%|█▏ | 171/1500 [01:51<14:17, 1.55it/s, loss=0.241, lr=1] Steps: 11%|█▏ | 171/1500 [01:51<14:17, 1.55it/s, loss=0.115, lr=1] Steps: 11%|█▏ | 172/1500 [01:52<14:16, 1.55it/s, loss=0.115, lr=1] Steps: 11%|█▏ | 172/1500 [01:52<14:16, 1.55it/s, loss=0.167, lr=1] Steps: 12%|█▏ | 173/1500 [01:53<14:15, 1.55it/s, loss=0.167, lr=1] Steps: 12%|█▏ | 173/1500 [01:53<14:15, 1.55it/s, loss=0.261, lr=1] Steps: 12%|█▏ | 174/1500 [01:53<14:14, 1.55it/s, loss=0.261, lr=1] Steps: 12%|█▏ | 174/1500 [01:53<14:14, 1.55it/s, loss=0.209, lr=1] Steps: 12%|█▏ | 175/1500 [01:54<14:15, 1.55it/s, loss=0.209, lr=1] Steps: 12%|█▏ | 175/1500 [01:54<14:15, 1.55it/s, loss=0.15, lr=1] Steps: 12%|█▏ | 176/1500 [01:54<14:14, 1.55it/s, loss=0.15, lr=1] Steps: 12%|█▏ | 176/1500 [01:54<14:14, 1.55it/s, loss=0.213, lr=1] Steps: 12%|█▏ | 177/1500 [01:55<14:18, 1.54it/s, loss=0.213, lr=1] Steps: 12%|█▏ | 177/1500 [01:55<14:18, 1.54it/s, loss=0.223, lr=1] Steps: 12%|█▏ | 178/1500 [01:56<14:18, 1.54it/s, loss=0.223, lr=1] Steps: 12%|█▏ | 178/1500 [01:56<14:18, 1.54it/s, loss=0.122, lr=1] Steps: 12%|█▏ | 179/1500 [01:56<14:15, 1.54it/s, loss=0.122, lr=1] Steps: 12%|█▏ | 179/1500 [01:56<14:15, 1.54it/s, loss=0.193, lr=1] Steps: 12%|█▏ | 180/1500 [01:57<14:13, 1.55it/s, loss=0.193, lr=1] Steps: 12%|█▏ | 180/1500 [01:57<14:13, 1.55it/s, loss=0.174, lr=1] Steps: 12%|█▏ | 181/1500 [01:58<14:12, 1.55it/s, loss=0.174, lr=1] Steps: 12%|█▏ | 181/1500 [01:58<14:12, 1.55it/s, loss=0.0489, lr=1] Steps: 12%|█▏ | 182/1500 [01:58<14:11, 1.55it/s, loss=0.0489, lr=1] Steps: 12%|█▏ | 182/1500 [01:58<14:11, 1.55it/s, loss=0.179, lr=1] Steps: 12%|█▏ | 183/1500 [01:59<14:09, 1.55it/s, loss=0.179, lr=1] Steps: 12%|█▏ | 183/1500 [01:59<14:09, 1.55it/s, loss=0.111, lr=1] Steps: 12%|█▏ 
| 184/1500 [02:00<14:08, 1.55it/s, loss=0.111, lr=1] Steps: 12%|█▏ | 184/1500 [02:00<14:08, 1.55it/s, loss=0.166, lr=1] Steps: 12%|█▏ | 185/1500 [02:00<14:07, 1.55it/s, loss=0.166, lr=1] Steps: 12%|█▏ | 185/1500 [02:00<14:07, 1.55it/s, loss=0.094, lr=1] Steps: 12%|█▏ | 186/1500 [02:01<14:07, 1.55it/s, loss=0.094, lr=1] Steps: 12%|█▏ | 186/1500 [02:01<14:07, 1.55it/s, loss=0.18, lr=1] Steps: 12%|█▏ | 187/1500 [02:02<14:06, 1.55it/s, loss=0.18, lr=1] Steps: 12%|█▏ | 187/1500 [02:02<14:06, 1.55it/s, loss=0.0663, lr=1] Steps: 13%|█▎ | 188/1500 [02:02<14:06, 1.55it/s, loss=0.0663, lr=1] Steps: 13%|█▎ | 188/1500 [02:02<14:06, 1.55it/s, loss=0.0255, lr=1] Steps: 13%|█▎ | 189/1500 [02:03<14:06, 1.55it/s, loss=0.0255, lr=1] Steps: 13%|█▎ | 189/1500 [02:03<14:06, 1.55it/s, loss=0.218, lr=1] Steps: 13%|█▎ | 190/1500 [02:04<14:05, 1.55it/s, loss=0.218, lr=1] Steps: 13%|█▎ | 190/1500 [02:04<14:05, 1.55it/s, loss=0.0205, lr=1] Steps: 13%|█▎ | 191/1500 [02:04<14:03, 1.55it/s, loss=0.0205, lr=1] Steps: 13%|█▎ | 191/1500 [02:04<14:03, 1.55it/s, loss=0.165, lr=1] Steps: 13%|█▎ | 192/1500 [02:05<14:04, 1.55it/s, loss=0.165, lr=1] Steps: 13%|█▎ | 192/1500 [02:05<14:04, 1.55it/s, loss=0.261, lr=1] Steps: 13%|█▎ | 193/1500 [02:05<14:08, 1.54it/s, loss=0.261, lr=1] Steps: 13%|█▎ | 193/1500 [02:05<14:08, 1.54it/s, loss=0.105, lr=1] Steps: 13%|█▎ | 194/1500 [02:06<14:06, 1.54it/s, loss=0.105, lr=1] Steps: 13%|█▎ | 194/1500 [02:06<14:06, 1.54it/s, loss=0.17, lr=1] Steps: 13%|█▎ | 195/1500 [02:07<14:04, 1.55it/s, loss=0.17, lr=1] Steps: 13%|█▎ | 195/1500 [02:07<14:04, 1.55it/s, loss=0.0959, lr=1] Steps: 13%|█▎ | 196/1500 [02:07<14:02, 1.55it/s, loss=0.0959, lr=1] Steps: 13%|█▎ | 196/1500 [02:07<14:02, 1.55it/s, loss=0.129, lr=1] Steps: 13%|█▎ | 197/1500 [02:08<14:01, 1.55it/s, loss=0.129, lr=1] Steps: 13%|█▎ | 197/1500 [02:08<14:01, 1.55it/s, loss=0.155, lr=1] Steps: 13%|█▎ | 198/1500 [02:09<13:59, 1.55it/s, loss=0.155, lr=1] Steps: 13%|█▎ | 198/1500 [02:09<13:59, 1.55it/s, loss=0.0291, 
lr=1] Steps: 13%|█▎ | 199/1500 [02:09<13:59, 1.55it/s, loss=0.0291, lr=1] Steps: 13%|█▎ | 199/1500 [02:09<13:59, 1.55it/s, loss=0.0118, lr=1] Steps: 13%|█▎ | 200/1500 [02:10<13:58, 1.55it/s, loss=0.0118, lr=1] Steps: 13%|█▎ | 200/1500 [02:10<13:58, 1.55it/s, loss=0.241, lr=1] Steps: 13%|█▎ | 201/1500 [02:11<13:57, 1.55it/s, loss=0.241, lr=1] Steps: 13%|█▎ | 201/1500 [02:11<13:57, 1.55it/s, loss=0.0772, lr=1] Steps: 13%|█▎ | 202/1500 [02:11<13:56, 1.55it/s, loss=0.0772, lr=1] Steps: 13%|█▎ | 202/1500 [02:11<13:56, 1.55it/s, loss=0.184, lr=1] Steps: 14%|█▎ | 203/1500 [02:12<13:56, 1.55it/s, loss=0.184, lr=1] Steps: 14%|█▎ | 203/1500 [02:12<13:56, 1.55it/s, loss=0.0356, lr=1] Steps: 14%|█▎ | 204/1500 [02:13<13:54, 1.55it/s, loss=0.0356, lr=1] Steps: 14%|█▎ | 204/1500 [02:13<13:54, 1.55it/s, loss=0.114, lr=1] Steps: 14%|█▎ | 205/1500 [02:13<13:53, 1.55it/s, loss=0.114, lr=1] Steps: 14%|█▎ | 205/1500 [02:13<13:53, 1.55it/s, loss=0.133, lr=1] Steps: 14%|█▎ | 206/1500 [02:14<13:53, 1.55it/s, loss=0.133, lr=1] Steps: 14%|█▎ | 206/1500 [02:14<13:53, 1.55it/s, loss=0.327, lr=1] Steps: 14%|█▍ | 207/1500 [02:14<13:53, 1.55it/s, loss=0.327, lr=1] Steps: 14%|█▍ | 207/1500 [02:14<13:53, 1.55it/s, loss=0.0946, lr=1] Steps: 14%|█▍ | 208/1500 [02:15<13:52, 1.55it/s, loss=0.0946, lr=1] Steps: 14%|█▍ | 208/1500 [02:15<13:52, 1.55it/s, loss=0.167, lr=1] Steps: 14%|█▍ | 209/1500 [02:16<13:58, 1.54it/s, loss=0.167, lr=1] Steps: 14%|█▍ | 209/1500 [02:16<13:58, 1.54it/s, loss=0.0422, lr=1] Steps: 14%|█▍ | 210/1500 [02:16<13:56, 1.54it/s, loss=0.0422, lr=1] Steps: 14%|█▍ | 210/1500 [02:16<13:56, 1.54it/s, loss=0.117, lr=1] Steps: 14%|█▍ | 211/1500 [02:17<13:54, 1.54it/s, loss=0.117, lr=1] Steps: 14%|█▍ | 211/1500 [02:17<13:54, 1.54it/s, loss=0.0348, lr=1] Steps: 14%|█▍ | 212/1500 [02:18<13:53, 1.55it/s, loss=0.0348, lr=1] Steps: 14%|█▍ | 212/1500 [02:18<13:53, 1.55it/s, loss=0.171, lr=1] Steps: 14%|█▍ | 213/1500 [02:18<13:51, 1.55it/s, loss=0.171, lr=1] Steps: 14%|█▍ | 213/1500 
[02:18<13:51, 1.55it/s, loss=0.0696, lr=1] Steps: 14%|█▍ | 214/1500 [02:19<13:50, 1.55it/s, loss=0.0696, lr=1] Steps: 14%|█▍ | 214/1500 [02:19<13:50, 1.55it/s, loss=0.0846, lr=1] Steps: 14%|█▍ | 215/1500 [02:20<13:48, 1.55it/s, loss=0.0846, lr=1] Steps: 14%|█▍ | 215/1500 [02:20<13:48, 1.55it/s, loss=0.146, lr=1] Steps: 14%|█▍ | 216/1500 [02:20<13:49, 1.55it/s, loss=0.146, lr=1] Steps: 14%|█▍ | 216/1500 [02:20<13:49, 1.55it/s, loss=0.121, lr=1] Steps: 14%|█▍ | 217/1500 [02:21<13:48, 1.55it/s, loss=0.121, lr=1] Steps: 14%|█▍ | 217/1500 [02:21<13:48, 1.55it/s, loss=0.246, lr=1] Steps: 15%|█▍ | 218/1500 [02:22<13:47, 1.55it/s, loss=0.246, lr=1] Steps: 15%|█▍ | 218/1500 [02:22<13:47, 1.55it/s, loss=0.115, lr=1] Steps: 15%|█▍ | 219/1500 [02:22<13:45, 1.55it/s, loss=0.115, lr=1] Steps: 15%|█▍ | 219/1500 [02:22<13:45, 1.55it/s, loss=0.138, lr=1] Steps: 15%|█▍ | 220/1500 [02:23<13:45, 1.55it/s, loss=0.138, lr=1] Steps: 15%|█▍ | 220/1500 [02:23<13:45, 1.55it/s, loss=0.0753, lr=1] Steps: 15%|█▍ | 221/1500 [02:24<13:44, 1.55it/s, loss=0.0753, lr=1] Steps: 15%|█▍ | 221/1500 [02:24<13:44, 1.55it/s, loss=0.167, lr=1] Steps: 15%|█▍ | 222/1500 [02:24<13:44, 1.55it/s, loss=0.167, lr=1] Steps: 15%|█▍ | 222/1500 [02:24<13:44, 1.55it/s, loss=0.162, lr=1] Steps: 15%|█▍ | 223/1500 [02:25<13:43, 1.55it/s, loss=0.162, lr=1] Steps: 15%|█▍ | 223/1500 [02:25<13:43, 1.55it/s, loss=0.18, lr=1] Steps: 15%|█▍ | 224/1500 [02:25<13:42, 1.55it/s, loss=0.18, lr=1] Steps: 15%|█▍ | 224/1500 [02:25<13:42, 1.55it/s, loss=0.0729, lr=1] Steps: 15%|█▌ | 225/1500 [02:26<13:45, 1.54it/s, loss=0.0729, lr=1] Steps: 15%|█▌ | 225/1500 [02:26<13:45, 1.54it/s, loss=0.571, lr=1] Steps: 15%|█▌ | 226/1500 [02:27<13:44, 1.55it/s, loss=0.571, lr=1] Steps: 15%|█▌ | 226/1500 [02:27<13:44, 1.55it/s, loss=0.0794, lr=1] Steps: 15%|█▌ | 227/1500 [02:27<13:43, 1.55it/s, loss=0.0794, lr=1] Steps: 15%|█▌ | 227/1500 [02:27<13:43, 1.55it/s, loss=0.0834, lr=1] Steps: 15%|█▌ | 228/1500 [02:28<13:42, 1.55it/s, loss=0.0834, lr=1] 
Steps: 15%|█▌ | 228/1500 [02:28<13:42, 1.55it/s, loss=0.209, lr=1] Steps: 15%|█▌ | 229/1500 [02:29<13:41, 1.55it/s, loss=0.209, lr=1] Steps: 15%|█▌ | 229/1500 [02:29<13:41, 1.55it/s, loss=0.216, lr=1] Steps: 15%|█▌ | 230/1500 [02:29<13:40, 1.55it/s, loss=0.216, lr=1] Steps: 15%|█▌ | 230/1500 [02:29<13:40, 1.55it/s, loss=0.17, lr=1] Steps: 15%|█▌ | 231/1500 [02:30<13:39, 1.55it/s, loss=0.17, lr=1] Steps: 15%|█▌ | 231/1500 [02:30<13:39, 1.55it/s, loss=0.169, lr=1] Steps: 15%|█▌ | 232/1500 [02:31<13:38, 1.55it/s, loss=0.169, lr=1] Steps: 15%|█▌ | 232/1500 [02:31<13:38, 1.55it/s, loss=0.23, lr=1] Steps: 16%|█▌ | 233/1500 [02:31<13:37, 1.55it/s, loss=0.23, lr=1] Steps: 16%|█▌ | 233/1500 [02:31<13:37, 1.55it/s, loss=0.139, lr=1] Steps: 16%|█▌ | 234/1500 [02:32<13:36, 1.55it/s, loss=0.139, lr=1] Steps: 16%|█▌ | 234/1500 [02:32<13:36, 1.55it/s, loss=0.121, lr=1] Steps: 16%|█▌ | 235/1500 [02:33<13:36, 1.55it/s, loss=0.121, lr=1] Steps: 16%|█▌ | 235/1500 [02:33<13:36, 1.55it/s, loss=0.155, lr=1] Steps: 16%|█▌ | 236/1500 [02:33<13:34, 1.55it/s, loss=0.155, lr=1] Steps: 16%|█▌ | 236/1500 [02:33<13:34, 1.55it/s, loss=0.135, lr=1] Steps: 16%|█▌ | 237/1500 [02:34<13:34, 1.55it/s, loss=0.135, lr=1] Steps: 16%|█▌ | 237/1500 [02:34<13:34, 1.55it/s, loss=0.144, lr=1] Steps: 16%|█▌ | 238/1500 [02:35<13:35, 1.55it/s, loss=0.144, lr=1] Steps: 16%|█▌ | 238/1500 [02:35<13:35, 1.55it/s, loss=0.209, lr=1] Steps: 16%|█▌ | 239/1500 [02:35<13:35, 1.55it/s, loss=0.209, lr=1] Steps: 16%|█▌ | 239/1500 [02:35<13:35, 1.55it/s, loss=0.207, lr=1] Steps: 16%|█▌ | 240/1500 [02:36<13:34, 1.55it/s, loss=0.207, lr=1] Steps: 16%|█▌ | 240/1500 [02:36<13:34, 1.55it/s, loss=0.15, lr=1] Steps: 16%|█▌ | 241/1500 [02:36<13:38, 1.54it/s, loss=0.15, lr=1] Steps: 16%|█▌ | 241/1500 [02:36<13:38, 1.54it/s, loss=0.0908, lr=1] Steps: 16%|█▌ | 242/1500 [02:37<13:36, 1.54it/s, loss=0.0908, lr=1] Steps: 16%|█▌ | 242/1500 [02:37<13:36, 1.54it/s, loss=0.254, lr=1] Steps: 16%|█▌ | 243/1500 [02:38<13:35, 1.54it/s, loss=0.254, 
lr=1] Steps: 16%|█▌ | 243/1500 [02:38<13:35, 1.54it/s, loss=0.108, lr=1] Steps: 16%|█▋ | 244/1500 [02:38<13:32, 1.54it/s, loss=0.108, lr=1] Steps: 16%|█▋ | 244/1500 [02:38<13:32, 1.54it/s, loss=0.167, lr=1] Steps: 16%|█▋ | 245/1500 [02:39<13:30, 1.55it/s, loss=0.167, lr=1] Steps: 16%|█▋ | 245/1500 [02:39<13:30, 1.55it/s, loss=0.127, lr=1] Steps: 16%|█▋ | 246/1500 [02:40<13:29, 1.55it/s, loss=0.127, lr=1] Steps: 16%|█▋ | 246/1500 [02:40<13:29, 1.55it/s, loss=0.0929, lr=1] Steps: 16%|█▋ | 247/1500 [02:40<13:29, 1.55it/s, loss=0.0929, lr=1] Steps: 16%|█▋ | 247/1500 [02:40<13:29, 1.55it/s, loss=0.125, lr=1] Steps: 17%|█▋ | 248/1500 [02:41<13:28, 1.55it/s, loss=0.125, lr=1] Steps: 17%|█▋ | 248/1500 [02:41<13:28, 1.55it/s, loss=0.161, lr=1] Steps: 17%|█▋ | 249/1500 [02:42<13:26, 1.55it/s, loss=0.161, lr=1] Steps: 17%|█▋ | 249/1500 [02:42<13:26, 1.55it/s, loss=0.138, lr=1] Steps: 17%|█▋ | 250/1500 [02:42<13:26, 1.55it/s, loss=0.138, lr=1] Steps: 17%|█▋ | 250/1500 [02:42<13:26, 1.55it/s, loss=0.322, lr=1] Steps: 17%|█▋ | 251/1500 [02:43<13:26, 1.55it/s, loss=0.322, lr=1] Steps: 17%|█▋ | 251/1500 [02:43<13:26, 1.55it/s, loss=0.136, lr=1] Steps: 17%|█▋ | 252/1500 [02:44<13:24, 1.55it/s, loss=0.136, lr=1] Steps: 17%|█▋ | 252/1500 [02:44<13:24, 1.55it/s, loss=0.121, lr=1] Steps: 17%|█▋ | 253/1500 [02:44<13:24, 1.55it/s, loss=0.121, lr=1] Steps: 17%|█▋ | 253/1500 [02:44<13:24, 1.55it/s, loss=0.148, lr=1] Steps: 17%|█▋ | 254/1500 [02:45<13:23, 1.55it/s, loss=0.148, lr=1] Steps: 17%|█▋ | 254/1500 [02:45<13:23, 1.55it/s, loss=0.166, lr=1] Steps: 17%|█▋ | 255/1500 [02:45<13:22, 1.55it/s, loss=0.166, lr=1] Steps: 17%|█▋ | 255/1500 [02:45<13:22, 1.55it/s, loss=0.148, lr=1] Steps: 17%|█▋ | 256/1500 [02:46<13:20, 1.55it/s, loss=0.148, lr=1] Steps: 17%|█▋ | 256/1500 [02:46<13:20, 1.55it/s, loss=0.0994, lr=1] Steps: 17%|█▋ | 257/1500 [02:47<13:26, 1.54it/s, loss=0.0994, lr=1] Steps: 17%|█▋ | 257/1500 [02:47<13:26, 1.54it/s, loss=0.0712, lr=1] Steps: 17%|█▋ | 258/1500 [02:47<13:24, 
1.54it/s, loss=0.0712, lr=1] Steps: 17%|█▋ | 258/1500 [02:47<13:24, 1.54it/s, loss=0.203, lr=1] Steps: 17%|█▋ | 259/1500 [02:48<13:21, 1.55it/s, loss=0.203, lr=1] Steps: 17%|█▋ | 259/1500 [02:48<13:21, 1.55it/s, loss=0.141, lr=1] Steps: 17%|█▋ | 260/1500 [02:49<13:20, 1.55it/s, loss=0.141, lr=1] Steps: 17%|█▋ | 260/1500 [02:49<13:20, 1.55it/s, loss=0.0987, lr=1] Steps: 17%|█▋ | 261/1500 [02:49<13:19, 1.55it/s, loss=0.0987, lr=1] Steps: 17%|█▋ | 261/1500 [02:49<13:19, 1.55it/s, loss=0.128, lr=1] Steps: 17%|█▋ | 262/1500 [02:50<13:18, 1.55it/s, loss=0.128, lr=1] Steps: 17%|█▋ | 262/1500 [02:50<13:18, 1.55it/s, loss=0.18, lr=1] Steps: 18%|█▊ | 263/1500 [02:51<13:17, 1.55it/s, loss=0.18, lr=1] Steps: 18%|█▊ | 263/1500 [02:51<13:17, 1.55it/s, loss=0.119, lr=1] Steps: 18%|█▊ | 264/1500 [02:51<13:15, 1.55it/s, loss=0.119, lr=1] Steps: 18%|█▊ | 264/1500 [02:51<13:15, 1.55it/s, loss=0.12, lr=1] Steps: 18%|█▊ | 265/1500 [02:52<13:14, 1.55it/s, loss=0.12, lr=1] Steps: 18%|█▊ | 265/1500 [02:52<13:14, 1.55it/s, loss=0.0896, lr=1] Steps: 18%|█▊ | 266/1500 [02:53<13:14, 1.55it/s, loss=0.0896, lr=1] Steps: 18%|█▊ | 266/1500 [02:53<13:14, 1.55it/s, loss=0.169, lr=1] Steps: 18%|█▊ | 267/1500 [02:53<13:14, 1.55it/s, loss=0.169, lr=1] Steps: 18%|█▊ | 267/1500 [02:53<13:14, 1.55it/s, loss=0.0474, lr=1] Steps: 18%|█▊ | 268/1500 [02:54<13:17, 1.55it/s, loss=0.0474, lr=1] Steps: 18%|█▊ | 268/1500 [02:54<13:17, 1.55it/s, loss=0.231, lr=1] Steps: 18%|█▊ | 269/1500 [02:55<13:16, 1.55it/s, loss=0.231, lr=1] Steps: 18%|█▊ | 269/1500 [02:55<13:16, 1.55it/s, loss=0.134, lr=1] Steps: 18%|█▊ | 270/1500 [02:55<13:15, 1.55it/s, loss=0.134, lr=1] Steps: 18%|█▊ | 270/1500 [02:55<13:15, 1.55it/s, loss=0.196, lr=1] Steps: 18%|█▊ | 271/1500 [02:56<13:15, 1.54it/s, loss=0.196, lr=1] Steps: 18%|█▊ | 271/1500 [02:56<13:15, 1.54it/s, loss=0.0384, lr=1] Steps: 18%|█▊ | 272/1500 [02:56<13:13, 1.55it/s, loss=0.0384, lr=1] Steps: 18%|█▊ | 272/1500 [02:56<13:13, 1.55it/s, loss=0.157, lr=1] Steps: 18%|█▊ | 
273/1500 [02:57<13:17, 1.54it/s, loss=0.157, lr=1] Steps: 18%|█▊ | 273/1500 [02:57<13:17, 1.54it/s, loss=0.135, lr=1] Steps: 18%|█▊ | 274/1500 [02:58<13:14, 1.54it/s, loss=0.135, lr=1] Steps: 18%|█▊ | 274/1500 [02:58<13:14, 1.54it/s, loss=0.149, lr=1] Steps: 18%|█▊ | 275/1500 [02:58<13:12, 1.55it/s, loss=0.149, lr=1] Steps: 18%|█▊ | 275/1500 [02:58<13:12, 1.55it/s, loss=0.138, lr=1] Steps: 18%|█▊ | 276/1500 [02:59<13:12, 1.55it/s, loss=0.138, lr=1] Steps: 18%|█▊ | 276/1500 [02:59<13:12, 1.55it/s, loss=0.304, lr=1] Steps: 18%|█▊ | 277/1500 [03:00<13:10, 1.55it/s, loss=0.304, lr=1] Steps: 18%|█▊ | 277/1500 [03:00<13:10, 1.55it/s, loss=0.189, lr=1] Steps: 19%|█▊ | 278/1500 [03:00<13:09, 1.55it/s, loss=0.189, lr=1] Steps: 19%|█▊ | 278/1500 [03:00<13:09, 1.55it/s, loss=0.256, lr=1] Steps: 19%|█▊ | 279/1500 [03:01<13:08, 1.55it/s, loss=0.256, lr=1] Steps: 19%|█▊ | 279/1500 [03:01<13:08, 1.55it/s, loss=0.0318, lr=1] Steps: 19%|█▊ | 280/1500 [03:02<13:07, 1.55it/s, loss=0.0318, lr=1] Steps: 19%|█▊ | 280/1500 [03:02<13:07, 1.55it/s, loss=0.0892, lr=1] Steps: 19%|█▊ | 281/1500 [03:02<13:05, 1.55it/s, loss=0.0892, lr=1] Steps: 19%|█▊ | 281/1500 [03:02<13:05, 1.55it/s, loss=0.191, lr=1] Steps: 19%|█▉ | 282/1500 [03:03<13:05, 1.55it/s, loss=0.191, lr=1] Steps: 19%|█▉ | 282/1500 [03:03<13:05, 1.55it/s, loss=0.132, lr=1] Steps: 19%|█▉ | 283/1500 [03:04<13:04, 1.55it/s, loss=0.132, lr=1] Steps: 19%|█▉ | 283/1500 [03:04<13:04, 1.55it/s, loss=0.12, lr=1] Steps: 19%|█▉ | 284/1500 [03:04<13:05, 1.55it/s, loss=0.12, lr=1] Steps: 19%|█▉ | 284/1500 [03:04<13:05, 1.55it/s, loss=0.18, lr=1] Steps: 19%|█▉ | 285/1500 [03:05<13:05, 1.55it/s, loss=0.18, lr=1] Steps: 19%|█▉ | 285/1500 [03:05<13:05, 1.55it/s, loss=0.109, lr=1] Steps: 19%|█▉ | 286/1500 [03:06<13:03, 1.55it/s, loss=0.109, lr=1] Steps: 19%|█▉ | 286/1500 [03:06<13:03, 1.55it/s, loss=0.0822, lr=1] Steps: 19%|█▉ | 287/1500 [03:06<13:02, 1.55it/s, loss=0.0822, lr=1] Steps: 19%|█▉ | 287/1500 [03:06<13:02, 1.55it/s, loss=0.0585, lr=1] 
Steps: 19%|█▉ | 288/1500 [03:07<13:01, 1.55it/s, loss=0.0585, lr=1] Steps: 19%|█▉ | 288/1500 [03:07<13:01, 1.55it/s, loss=0.266, lr=1] Steps: 19%|█▉ | 289/1500 [03:07<13:05, 1.54it/s, loss=0.266, lr=1] Steps: 19%|█▉ | 289/1500 [03:07<13:05, 1.54it/s, loss=0.141, lr=1] Steps: 19%|█▉ | 290/1500 [03:08<13:03, 1.55it/s, loss=0.141, lr=1] Steps: 19%|█▉ | 290/1500 [03:08<13:03, 1.55it/s, loss=0.2, lr=1] Steps: 19%|█▉ | 291/1500 [03:09<13:01, 1.55it/s, loss=0.2, lr=1] Steps: 19%|█▉ | 291/1500 [03:09<13:01, 1.55it/s, loss=0.271, lr=1] Steps: 19%|█▉ | 292/1500 [03:09<13:00, 1.55it/s, loss=0.271, lr=1] Steps: 19%|█▉ | 292/1500 [03:09<13:00, 1.55it/s, loss=0.197, lr=1] Steps: 20%|█▉ | 293/1500 [03:10<12:59, 1.55it/s, loss=0.197, lr=1] Steps: 20%|█▉ | 293/1500 [03:10<12:59, 1.55it/s, loss=0.171, lr=1] Steps: 20%|█▉ | 294/1500 [03:11<12:58, 1.55it/s, loss=0.171, lr=1] Steps: 20%|█▉ | 294/1500 [03:11<12:58, 1.55it/s, loss=0.0934, lr=1] Steps: 20%|█▉ | 295/1500 [03:11<12:57, 1.55it/s, loss=0.0934, lr=1] Steps: 20%|█▉ | 295/1500 [03:11<12:57, 1.55it/s, loss=0.0422, lr=1] Steps: 20%|█▉ | 296/1500 [03:12<12:56, 1.55it/s, loss=0.0422, lr=1] Steps: 20%|█▉ | 296/1500 [03:12<12:56, 1.55it/s, loss=0.29, lr=1] Steps: 20%|█▉ | 297/1500 [03:13<12:55, 1.55it/s, loss=0.29, lr=1] Steps: 20%|█▉ | 297/1500 [03:13<12:55, 1.55it/s, loss=0.237, lr=1] Steps: 20%|█▉ | 298/1500 [03:13<12:54, 1.55it/s, loss=0.237, lr=1] Steps: 20%|█▉ | 298/1500 [03:13<12:54, 1.55it/s, loss=0.142, lr=1] Steps: 20%|█▉ | 299/1500 [03:14<12:54, 1.55it/s, loss=0.142, lr=1] Steps: 20%|█▉ | 299/1500 [03:14<12:54, 1.55it/s, loss=0.0447, lr=1] Steps: 20%|██ | 300/1500 [03:15<12:53, 1.55it/s, loss=0.0447, lr=1] Steps: 20%|██ | 300/1500 [03:15<12:53, 1.55it/s, loss=0.112, lr=1] Steps: 20%|██ | 301/1500 [03:15<12:52, 1.55it/s, loss=0.112, lr=1] Steps: 20%|██ | 301/1500 [03:15<12:52, 1.55it/s, loss=0.0553, lr=1] Steps: 20%|██ | 302/1500 [03:16<12:53, 1.55it/s, loss=0.0553, lr=1] Steps: 20%|██ | 302/1500 [03:16<12:53, 1.55it/s, 
loss=0.0361, lr=1] Steps: 20%|██ | 303/1500 [03:16<12:52, 1.55it/s, loss=0.0361, lr=1] Steps: 20%|██ | 303/1500 [03:16<12:52, 1.55it/s, loss=0.0686, lr=1] Steps: 20%|██ | 304/1500 [03:17<12:50, 1.55it/s, loss=0.0686, lr=1] Steps: 20%|██ | 304/1500 [03:17<12:50, 1.55it/s, loss=0.0536, lr=1] Steps: 20%|██ | 305/1500 [03:18<12:54, 1.54it/s, loss=0.0536, lr=1] Steps: 20%|██ | 305/1500 [03:18<12:54, 1.54it/s, loss=0.14, lr=1] Steps: 20%|██ | 306/1500 [03:18<12:53, 1.54it/s, loss=0.14, lr=1] Steps: 20%|██ | 306/1500 [03:18<12:53, 1.54it/s, loss=0.144, lr=1] Steps: 20%|██ | 307/1500 [03:19<12:51, 1.55it/s, loss=0.144, lr=1] Steps: 20%|██ | 307/1500 [03:19<12:51, 1.55it/s, loss=0.102, lr=1] Steps: 21%|██ | 308/1500 [03:20<12:51, 1.54it/s, loss=0.102, lr=1] Steps: 21%|██ | 308/1500 [03:20<12:51, 1.54it/s, loss=0.236, lr=1] Steps: 21%|██ | 309/1500 [03:20<12:50, 1.54it/s, loss=0.236, lr=1] Steps: 21%|██ | 309/1500 [03:20<12:50, 1.54it/s, loss=0.0863, lr=1] Steps: 21%|██ | 310/1500 [03:21<12:49, 1.55it/s, loss=0.0863, lr=1] Steps: 21%|██ | 310/1500 [03:21<12:49, 1.55it/s, loss=0.11, lr=1] Steps: 21%|██ | 311/1500 [03:22<12:47, 1.55it/s, loss=0.11, lr=1] Steps: 21%|██ | 311/1500 [03:22<12:47, 1.55it/s, loss=0.263, lr=1] Steps: 21%|██ | 312/1500 [03:22<12:46, 1.55it/s, loss=0.263, lr=1] Steps: 21%|██ | 312/1500 [03:22<12:46, 1.55it/s, loss=0.232, lr=1] Steps: 21%|██ | 313/1500 [03:23<12:46, 1.55it/s, loss=0.232, lr=1] Steps: 21%|██ | 313/1500 [03:23<12:46, 1.55it/s, loss=0.126, lr=1] Steps: 21%|██ | 314/1500 [03:24<12:44, 1.55it/s, loss=0.126, lr=1] Steps: 21%|██ | 314/1500 [03:24<12:44, 1.55it/s, loss=0.0702, lr=1] Steps: 21%|██ | 315/1500 [03:24<12:44, 1.55it/s, loss=0.0702, lr=1] Steps: 21%|██ | 315/1500 [03:24<12:44, 1.55it/s, loss=0.125, lr=1] Steps: 21%|██ | 316/1500 [03:25<12:43, 1.55it/s, loss=0.125, lr=1] Steps: 21%|██ | 316/1500 [03:25<12:43, 1.55it/s, loss=0.131, lr=1] Steps: 21%|██ | 317/1500 [03:26<12:42, 1.55it/s, loss=0.131, lr=1] Steps: 21%|██ | 317/1500 
[03:26<12:42, 1.55it/s, loss=0.0772, lr=1] Steps: 21%|██ | 318/1500 [03:26<12:43, 1.55it/s, loss=0.0772, lr=1] Steps: 21%|██ | 318/1500 [03:26<12:43, 1.55it/s, loss=0.294, lr=1] Steps: 21%|██▏ | 319/1500 [03:27<12:43, 1.55it/s, loss=0.294, lr=1] Steps: 21%|██▏ | 319/1500 [03:27<12:43, 1.55it/s, loss=0.0634, lr=1] Steps: 21%|██▏ | 320/1500 [03:27<12:43, 1.54it/s, loss=0.0634, lr=1] Steps: 21%|██▏ | 320/1500 [03:27<12:43, 1.54it/s, loss=0.0562, lr=1] Steps: 21%|██▏ | 321/1500 [03:28<12:48, 1.54it/s, loss=0.0562, lr=1] Steps: 21%|██▏ | 321/1500 [03:28<12:48, 1.54it/s, loss=0.169, lr=1] Steps: 21%|██▏ | 322/1500 [03:29<12:46, 1.54it/s, loss=0.169, lr=1] Steps: 21%|██▏ | 322/1500 [03:29<12:46, 1.54it/s, loss=0.262, lr=1] Steps: 22%|██▏ | 323/1500 [03:29<12:44, 1.54it/s, loss=0.262, lr=1] Steps: 22%|██▏ | 323/1500 [03:29<12:44, 1.54it/s, loss=0.128, lr=1] Steps: 22%|██▏ | 324/1500 [03:30<12:42, 1.54it/s, loss=0.128, lr=1] Steps: 22%|██▏ | 324/1500 [03:30<12:42, 1.54it/s, loss=0.0749, lr=1] Steps: 22%|██▏ | 325/1500 [03:31<12:41, 1.54it/s, loss=0.0749, lr=1] Steps: 22%|██▏ | 325/1500 [03:31<12:41, 1.54it/s, loss=0.141, lr=1] Steps: 22%|██▏ | 326/1500 [03:31<12:43, 1.54it/s, loss=0.141, lr=1] Steps: 22%|██▏ | 326/1500 [03:31<12:43, 1.54it/s, loss=0.0817, lr=1] Steps: 22%|██▏ | 327/1500 [03:32<12:41, 1.54it/s, loss=0.0817, lr=1] Steps: 22%|██▏ | 327/1500 [03:32<12:41, 1.54it/s, loss=0.128, lr=1] Steps: 22%|██▏ | 328/1500 [03:33<12:39, 1.54it/s, loss=0.128, lr=1] Steps: 22%|██▏ | 328/1500 [03:33<12:39, 1.54it/s, loss=0.0994, lr=1] Steps: 22%|██▏ | 329/1500 [03:33<12:38, 1.54it/s, loss=0.0994, lr=1] Steps: 22%|██▏ | 329/1500 [03:33<12:38, 1.54it/s, loss=0.192, lr=1] Steps: 22%|██▏ | 330/1500 [03:34<12:38, 1.54it/s, loss=0.192, lr=1] Steps: 22%|██▏ | 330/1500 [03:34<12:38, 1.54it/s, loss=0.0226, lr=1] Steps: 22%|██▏ | 331/1500 [03:35<12:37, 1.54it/s, loss=0.0226, lr=1] Steps: 22%|██▏ | 331/1500 [03:35<12:37, 1.54it/s, loss=0.143, lr=1] Steps: 22%|██▏ | 332/1500 [03:35<12:36, 
1.54it/s, loss=0.143, lr=1] Steps: 22%|██▏ | 332/1500 [03:35<12:36, 1.54it/s, loss=0.099, lr=1] Steps: 22%|██▏ | 333/1500 [03:36<12:36, 1.54it/s, loss=0.099, lr=1] Steps: 22%|██▏ | 333/1500 [03:36<12:36, 1.54it/s, loss=0.089, lr=1] Steps: 22%|██▏ | 334/1500 [03:37<12:34, 1.54it/s, loss=0.089, lr=1] Steps: 22%|██▏ | 334/1500 [03:37<12:34, 1.54it/s, loss=0.148, lr=1] Steps: 22%|██▏ | 335/1500 [03:37<12:34, 1.54it/s, loss=0.148, lr=1] Steps: 22%|██▏ | 335/1500 [03:37<12:34, 1.54it/s, loss=0.076, lr=1] Steps: 22%|██▏ | 336/1500 [03:38<12:32, 1.55it/s, loss=0.076, lr=1] Steps: 22%|██▏ | 336/1500 [03:38<12:32, 1.55it/s, loss=0.0491, lr=1] Steps: 22%|██▏ | 337/1500 [03:39<12:36, 1.54it/s, loss=0.0491, lr=1] Steps: 22%|██▏ | 337/1500 [03:39<12:36, 1.54it/s, loss=0.143, lr=1] Steps: 23%|██▎ | 338/1500 [03:39<12:34, 1.54it/s, loss=0.143, lr=1] Steps: 23%|██▎ | 338/1500 [03:39<12:34, 1.54it/s, loss=0.111, lr=1] Steps: 23%|██▎ | 339/1500 [03:40<12:33, 1.54it/s, loss=0.111, lr=1] Steps: 23%|██▎ | 339/1500 [03:40<12:33, 1.54it/s, loss=0.141, lr=1] Steps: 23%|██▎ | 340/1500 [03:40<12:32, 1.54it/s, loss=0.141, lr=1] Steps: 23%|██▎ | 340/1500 [03:40<12:32, 1.54it/s, loss=0.0985, lr=1] Steps: 23%|██▎ | 341/1500 [03:41<12:31, 1.54it/s, loss=0.0985, lr=1] Steps: 23%|██▎ | 341/1500 [03:41<12:31, 1.54it/s, loss=0.035, lr=1] Steps: 23%|██▎ | 342/1500 [03:42<12:30, 1.54it/s, loss=0.035, lr=1] Steps: 23%|██▎ | 342/1500 [03:42<12:30, 1.54it/s, loss=0.171, lr=1] Steps: 23%|██▎ | 343/1500 [03:42<12:30, 1.54it/s, loss=0.171, lr=1] Steps: 23%|██▎ | 343/1500 [03:42<12:30, 1.54it/s, loss=0.0746, lr=1] Steps: 23%|██▎ | 344/1500 [03:43<12:28, 1.54it/s, loss=0.0746, lr=1] Steps: 23%|██▎ | 344/1500 [03:43<12:28, 1.54it/s, loss=0.029, lr=1] Steps: 23%|██▎ | 345/1500 [03:44<12:29, 1.54it/s, loss=0.029, lr=1] Steps: 23%|██▎ | 345/1500 [03:44<12:29, 1.54it/s, loss=0.127, lr=1] Steps: 23%|██▎ | 346/1500 [03:44<12:33, 1.53it/s, loss=0.127, lr=1] Steps: 23%|██▎ | 346/1500 [03:44<12:33, 1.53it/s, loss=0.242, 
lr=1] Steps: 23%|██▎ | 347/1500 [03:45<12:31, 1.53it/s, loss=0.242, lr=1] Steps: 23%|██▎ | 347/1500 [03:45<12:31, 1.53it/s, loss=0.217, lr=1] Steps: 23%|██▎ | 348/1500 [03:46<12:28, 1.54it/s, loss=0.217, lr=1] Steps: 23%|██▎ | 348/1500 [03:46<12:28, 1.54it/s, loss=0.155, lr=1] Steps: 23%|██▎ | 349/1500 [03:46<12:27, 1.54it/s, loss=0.155, lr=1] Steps: 23%|██▎ | 349/1500 [03:46<12:27, 1.54it/s, loss=0.0558, lr=1] Steps: 23%|██▎ | 350/1500 [03:47<12:25, 1.54it/s, loss=0.0558, lr=1] Steps: 23%|██▎ | 350/1500 [03:47<12:25, 1.54it/s, loss=0.328, lr=1] Steps: 23%|██▎ | 351/1500 [03:48<12:22, 1.55it/s, loss=0.328, lr=1] Steps: 23%|██▎ | 351/1500 [03:48<12:22, 1.55it/s, loss=0.18, lr=1] Steps: 23%|██▎ | 352/1500 [03:48<12:21, 1.55it/s, loss=0.18, lr=1] Steps: 23%|██▎ | 352/1500 [03:48<12:21, 1.55it/s, loss=0.11, lr=1] Steps: 24%|██▎ | 353/1500 [03:49<12:24, 1.54it/s, loss=0.11, lr=1] Steps: 24%|██▎ | 353/1500 [03:49<12:24, 1.54it/s, loss=0.0542, lr=1] Steps: 24%|██▎ | 354/1500 [03:50<12:22, 1.54it/s, loss=0.0542, lr=1] Steps: 24%|██▎ | 354/1500 [03:50<12:22, 1.54it/s, loss=0.166, lr=1] Steps: 24%|██▎ | 355/1500 [03:50<12:21, 1.54it/s, loss=0.166, lr=1] Steps: 24%|██▎ | 355/1500 [03:50<12:21, 1.54it/s, loss=0.0277, lr=1] Steps: 24%|██▎ | 356/1500 [03:51<12:21, 1.54it/s, loss=0.0277, lr=1] Steps: 24%|██▎ | 356/1500 [03:51<12:21, 1.54it/s, loss=0.149, lr=1] Steps: 24%|██▍ | 357/1500 [03:51<12:19, 1.55it/s, loss=0.149, lr=1] Steps: 24%|██▍ | 357/1500 [03:51<12:19, 1.55it/s, loss=0.023, lr=1] Steps: 24%|██▍ | 358/1500 [03:52<12:17, 1.55it/s, loss=0.023, lr=1] Steps: 24%|██▍ | 358/1500 [03:52<12:17, 1.55it/s, loss=0.143, lr=1] Steps: 24%|██▍ | 359/1500 [03:53<12:17, 1.55it/s, loss=0.143, lr=1] Steps: 24%|██▍ | 359/1500 [03:53<12:17, 1.55it/s, loss=0.11, lr=1] Steps: 24%|██▍ | 360/1500 [03:53<12:16, 1.55it/s, loss=0.11, lr=1] Steps: 24%|██▍ | 360/1500 [03:53<12:16, 1.55it/s, loss=0.45, lr=1] Steps: 24%|██▍ | 361/1500 [03:54<12:15, 1.55it/s, loss=0.45, lr=1] Steps: 24%|██▍ | 
361/1500 [03:54<12:15, 1.55it/s, loss=0.159, lr=1] Steps: 24%|██▍ | 362/1500 [03:55<12:15, 1.55it/s, loss=0.159, lr=1] Steps: 24%|██▍ | 362/1500 [03:55<12:15, 1.55it/s, loss=0.218, lr=1] Steps: 24%|██▍ | 363/1500 [03:55<12:14, 1.55it/s, loss=0.218, lr=1] Steps: 24%|██▍ | 363/1500 [03:55<12:14, 1.55it/s, loss=0.228, lr=1] Steps: 24%|██▍ | 364/1500 [03:56<12:15, 1.54it/s, loss=0.228, lr=1] Steps: 24%|██▍ | 364/1500 [03:56<12:15, 1.54it/s, loss=0.164, lr=1] Steps: 24%|██▍ | 365/1500 [03:57<12:13, 1.55it/s, loss=0.164, lr=1] Steps: 24%|██▍ | 365/1500 [03:57<12:13, 1.55it/s, loss=0.151, lr=1] Steps: 24%|██▍ | 366/1500 [03:57<12:13, 1.55it/s, loss=0.151, lr=1] Steps: 24%|██▍ | 366/1500 [03:57<12:13, 1.55it/s, loss=0.188, lr=1] Steps: 24%|██▍ | 367/1500 [03:58<12:14, 1.54it/s, loss=0.188, lr=1] Steps: 24%|██▍ | 367/1500 [03:58<12:14, 1.54it/s, loss=0.0827, lr=1] Steps: 25%|██▍ | 368/1500 [03:59<12:12, 1.55it/s, loss=0.0827, lr=1] Steps: 25%|██▍ | 368/1500 [03:59<12:12, 1.55it/s, loss=0.0621, lr=1] Steps: 25%|██▍ | 369/1500 [03:59<12:15, 1.54it/s, loss=0.0621, lr=1] Steps: 25%|██▍ | 369/1500 [03:59<12:15, 1.54it/s, loss=0.151, lr=1] Steps: 25%|██▍ | 370/1500 [04:00<12:14, 1.54it/s, loss=0.151, lr=1] Steps: 25%|██▍ | 370/1500 [04:00<12:14, 1.54it/s, loss=0.256, lr=1] Steps: 25%|██▍ | 371/1500 [04:01<12:11, 1.54it/s, loss=0.256, lr=1] Steps: 25%|██▍ | 371/1500 [04:01<12:11, 1.54it/s, loss=0.176, lr=1] Steps: 25%|██▍ | 372/1500 [04:01<12:09, 1.55it/s, loss=0.176, lr=1] Steps: 25%|██▍ | 372/1500 [04:01<12:09, 1.55it/s, loss=0.113, lr=1] Steps: 25%|██▍ | 373/1500 [04:02<12:09, 1.54it/s, loss=0.113, lr=1] Steps: 25%|██▍ | 373/1500 [04:02<12:09, 1.54it/s, loss=0.308, lr=1] Steps: 25%|██▍ | 374/1500 [04:02<12:09, 1.54it/s, loss=0.308, lr=1] Steps: 25%|██▍ | 374/1500 [04:02<12:09, 1.54it/s, loss=0.155, lr=1] Steps: 25%|██▌ | 375/1500 [04:03<12:07, 1.55it/s, loss=0.155, lr=1] Steps: 25%|██▌ | 375/1500 [04:03<12:07, 1.55it/s, loss=0.259, lr=1] Steps: 25%|██▌ | 376/1500 [04:04<12:08, 
1.54it/s, loss=0.259, lr=1] Steps: 25%|██▌ | 376/1500 [04:04<12:08, 1.54it/s, loss=0.0988, lr=1] Steps: 25%|██▌ | 377/1500 [04:04<12:07, 1.54it/s, loss=0.0988, lr=1] Steps: 25%|██▌ | 377/1500 [04:04<12:07, 1.54it/s, loss=0.135, lr=1] Steps: 25%|██▌ | 378/1500 [04:05<12:06, 1.55it/s, loss=0.135, lr=1] Steps: 25%|██▌ | 378/1500 [04:05<12:06, 1.55it/s, loss=0.295, lr=1] Steps: 25%|██▌ | 379/1500 [04:06<12:05, 1.54it/s, loss=0.295, lr=1] Steps: 25%|██▌ | 379/1500 [04:06<12:05, 1.54it/s, loss=0.191, lr=1] Steps: 25%|██▌ | 380/1500 [04:06<12:04, 1.55it/s, loss=0.191, lr=1] Steps: 25%|██▌ | 380/1500 [04:06<12:04, 1.55it/s, loss=0.0877, lr=1] Steps: 25%|██▌ | 381/1500 [04:07<12:02, 1.55it/s, loss=0.0877, lr=1] Steps: 25%|██▌ | 381/1500 [04:07<12:02, 1.55it/s, loss=0.118, lr=1] Steps: 25%|██▌ | 382/1500 [04:08<12:01, 1.55it/s, loss=0.118, lr=1] Steps: 25%|██▌ | 382/1500 [04:08<12:01, 1.55it/s, loss=0.215, lr=1] Steps: 26%|██▌ | 383/1500 [04:08<12:01, 1.55it/s, loss=0.215, lr=1] Steps: 26%|██▌ | 383/1500 [04:08<12:01, 1.55it/s, loss=0.157, lr=1] Steps: 26%|██▌ | 384/1500 [04:09<12:00, 1.55it/s, loss=0.157, lr=1] Steps: 26%|██▌ | 384/1500 [04:09<12:00, 1.55it/s, loss=0.177, lr=1] Steps: 26%|██▌ | 385/1500 [04:10<12:03, 1.54it/s, loss=0.177, lr=1] Steps: 26%|██▌ | 385/1500 [04:10<12:03, 1.54it/s, loss=0.204, lr=1] Steps: 26%|██▌ | 386/1500 [04:10<12:01, 1.54it/s, loss=0.204, lr=1] Steps: 26%|██▌ | 386/1500 [04:10<12:01, 1.54it/s, loss=0.131, lr=1] Steps: 26%|██▌ | 387/1500 [04:11<11:59, 1.55it/s, loss=0.131, lr=1] Steps: 26%|██▌ | 387/1500 [04:11<11:59, 1.55it/s, loss=0.184, lr=1] Steps: 26%|██▌ | 388/1500 [04:12<11:57, 1.55it/s, loss=0.184, lr=1] Steps: 26%|██▌ | 388/1500 [04:12<11:57, 1.55it/s, loss=0.0585, lr=1] Steps: 26%|██▌ | 389/1500 [04:12<11:56, 1.55it/s, loss=0.0585, lr=1] Steps: 26%|██▌ | 389/1500 [04:12<11:56, 1.55it/s, loss=0.182, lr=1] Steps: 26%|██▌ | 390/1500 [04:13<11:56, 1.55it/s, loss=0.182, lr=1] Steps: 26%|██▌ | 390/1500 [04:13<11:56, 1.55it/s, 
loss=0.0418, lr=1] Steps: 26%|██▌ | 391/1500 [04:13<11:54, 1.55it/s, loss=0.0418, lr=1] Steps: 26%|██▌ | 391/1500 [04:13<11:54, 1.55it/s, loss=0.104, lr=1] Steps: 26%|██▌ | 392/1500 [04:14<11:53, 1.55it/s, loss=0.104, lr=1] Steps: 26%|██▌ | 392/1500 [04:14<11:53, 1.55it/s, loss=0.0842, lr=1] Steps: 26%|██▌ | 393/1500 [04:15<11:52, 1.55it/s, loss=0.0842, lr=1] Steps: 26%|██▌ | 393/1500 [04:15<11:52, 1.55it/s, loss=0.0876, lr=1] Steps: 26%|██▋ | 394/1500 [04:15<11:52, 1.55it/s, loss=0.0876, lr=1] Steps: 26%|██▋ | 394/1500 [04:15<11:52, 1.55it/s, loss=0.183, lr=1] Steps: 26%|██▋ | 395/1500 [04:16<11:52, 1.55it/s, loss=0.183, lr=1] Steps: 26%|██▋ | 395/1500 [04:16<11:52, 1.55it/s, loss=0.11, lr=1] Steps: 26%|██▋ | 396/1500 [04:17<11:53, 1.55it/s, loss=0.11, lr=1] Steps: 26%|██▋ | 396/1500 [04:17<11:53, 1.55it/s, loss=0.109, lr=1] Steps: 26%|██▋ | 397/1500 [04:17<11:52, 1.55it/s, loss=0.109, lr=1] Steps: 26%|██▋ | 397/1500 [04:17<11:52, 1.55it/s, loss=0.0939, lr=1] Steps: 27%|██▋ | 398/1500 [04:18<11:51, 1.55it/s, loss=0.0939, lr=1] Steps: 27%|██▋ | 398/1500 [04:18<11:51, 1.55it/s, loss=0.0673, lr=1] Steps: 27%|██▋ | 399/1500 [04:19<11:51, 1.55it/s, loss=0.0673, lr=1] Steps: 27%|██▋ | 399/1500 [04:19<11:51, 1.55it/s, loss=0.0303, lr=1] Steps: 27%|██▋ | 400/1500 [04:19<11:49, 1.55it/s, loss=0.0303, lr=1] Steps: 27%|██▋ | 400/1500 [04:19<11:49, 1.55it/s, loss=0.0659, lr=1] Steps: 27%|██▋ | 401/1500 [04:20<11:54, 1.54it/s, loss=0.0659, lr=1] Steps: 27%|██▋ | 401/1500 [04:20<11:54, 1.54it/s, loss=0.103, lr=1] Steps: 27%|██▋ | 402/1500 [04:21<11:52, 1.54it/s, loss=0.103, lr=1] Steps: 27%|██▋ | 402/1500 [04:21<11:52, 1.54it/s, loss=0.0599, lr=1] Steps: 27%|██▋ | 403/1500 [04:21<11:52, 1.54it/s, loss=0.0599, lr=1] Steps: 27%|██▋ | 403/1500 [04:21<11:52, 1.54it/s, loss=0.16, lr=1] Steps: 27%|██▋ | 404/1500 [04:22<11:50, 1.54it/s, loss=0.16, lr=1] Steps: 27%|██▋ | 404/1500 [04:22<11:50, 1.54it/s, loss=0.0198, lr=1] Steps: 27%|██▋ | 405/1500 [04:23<11:48, 1.55it/s, loss=0.0198, 
lr=1] Steps: 27%|██▋ | 405/1500 [04:23<11:48, 1.55it/s, loss=0.0182, lr=1] Steps: 27%|██▋ | 406/1500 [04:23<11:46, 1.55it/s, loss=0.0182, lr=1] Steps: 27%|██▋ | 406/1500 [04:23<11:46, 1.55it/s, loss=0.0636, lr=1] Steps: 27%|██▋ | 407/1500 [04:24<11:46, 1.55it/s, loss=0.0636, lr=1] Steps: 27%|██▋ | 407/1500 [04:24<11:46, 1.55it/s, loss=0.114, lr=1] Steps: 27%|██▋ | 408/1500 [04:24<11:45, 1.55it/s, loss=0.114, lr=1] Steps: 27%|██▋ | 408/1500 [04:24<11:45, 1.55it/s, loss=0.0698, lr=1] Steps: 27%|██▋ | 409/1500 [04:25<11:44, 1.55it/s, loss=0.0698, lr=1] Steps: 27%|██▋ | 409/1500 [04:25<11:44, 1.55it/s, loss=0.153, lr=1] Steps: 27%|██▋ | 410/1500 [04:26<11:43, 1.55it/s, loss=0.153, lr=1] Steps: 27%|██▋ | 410/1500 [04:26<11:43, 1.55it/s, loss=0.118, lr=1] Steps: 27%|██▋ | 411/1500 [04:26<11:42, 1.55it/s, loss=0.118, lr=1] Steps: 27%|██▋ | 411/1500 [04:26<11:42, 1.55it/s, loss=0.139, lr=1] Steps: 27%|██▋ | 412/1500 [04:27<11:42, 1.55it/s, loss=0.139, lr=1] Steps: 27%|██▋ | 412/1500 [04:27<11:42, 1.55it/s, loss=0.143, lr=1] Steps: 28%|██▊ | 413/1500 [04:28<11:41, 1.55it/s, loss=0.143, lr=1] Steps: 28%|██▊ | 413/1500 [04:28<11:41, 1.55it/s, loss=0.3, lr=1] Steps: 28%|██▊ | 414/1500 [04:28<11:40, 1.55it/s, loss=0.3, lr=1] Steps: 28%|██▊ | 414/1500 [04:28<11:40, 1.55it/s, loss=0.0971, lr=1] Steps: 28%|██▊ | 415/1500 [04:29<11:40, 1.55it/s, loss=0.0971, lr=1] Steps: 28%|██▊ | 415/1500 [04:29<11:40, 1.55it/s, loss=0.135, lr=1] Steps: 28%|██▊ | 416/1500 [04:30<11:39, 1.55it/s, loss=0.135, lr=1] Steps: 28%|██▊ | 416/1500 [04:30<11:39, 1.55it/s, loss=0.0597, lr=1] Steps: 28%|██▊ | 417/1500 [04:30<11:42, 1.54it/s, loss=0.0597, lr=1] Steps: 28%|██▊ | 417/1500 [04:30<11:42, 1.54it/s, loss=0.113, lr=1] Steps: 28%|██▊ | 418/1500 [04:31<11:40, 1.55it/s, loss=0.113, lr=1] Steps: 28%|██▊ | 418/1500 [04:31<11:40, 1.55it/s, loss=0.125, lr=1] Steps: 28%|██▊ | 419/1500 [04:32<11:38, 1.55it/s, loss=0.125, lr=1] Steps: 28%|██▊ | 419/1500 [04:32<11:38, 1.55it/s, loss=0.075, lr=1] Steps: 28%|██▊ 
| 420/1500 [04:32<11:36, 1.55it/s, loss=0.075, lr=1] Steps: 28%|██▊ | 420/1500 [04:32<11:36, 1.55it/s, loss=0.159, lr=1] Steps: 28%|██▊ | 421/1500 [04:33<11:35, 1.55it/s, loss=0.159, lr=1] Steps: 28%|██▊ | 421/1500 [04:33<11:35, 1.55it/s, loss=0.15, lr=1] Steps: 28%|██▊ | 422/1500 [04:33<11:34, 1.55it/s, loss=0.15, lr=1] Steps: 28%|██▊ | 422/1500 [04:33<11:34, 1.55it/s, loss=0.099, lr=1] Steps: 28%|██▊ | 423/1500 [04:34<11:33, 1.55it/s, loss=0.099, lr=1] Steps: 28%|██▊ | 423/1500 [04:34<11:33, 1.55it/s, loss=0.156, lr=1] Steps: 28%|██▊ | 424/1500 [04:35<11:33, 1.55it/s, loss=0.156, lr=1] Steps: 28%|██▊ | 424/1500 [04:35<11:33, 1.55it/s, loss=0.0727, lr=1] Steps: 28%|██▊ | 425/1500 [04:35<11:34, 1.55it/s, loss=0.0727, lr=1] Steps: 28%|██▊ | 425/1500 [04:35<11:34, 1.55it/s, loss=0.143, lr=1] Steps: 28%|██▊ | 426/1500 [04:36<11:33, 1.55it/s, loss=0.143, lr=1] Steps: 28%|██▊ | 426/1500 [04:36<11:33, 1.55it/s, loss=0.113, lr=1] Steps: 28%|██▊ | 427/1500 [04:37<11:32, 1.55it/s, loss=0.113, lr=1] Steps: 28%|██▊ | 427/1500 [04:37<11:32, 1.55it/s, loss=0.209, lr=1] Steps: 29%|██▊ | 428/1500 [04:37<11:31, 1.55it/s, loss=0.209, lr=1] Steps: 29%|██▊ | 428/1500 [04:37<11:31, 1.55it/s, loss=0.438, lr=1] Steps: 29%|██▊ | 429/1500 [04:38<11:30, 1.55it/s, loss=0.438, lr=1] Steps: 29%|██▊ | 429/1500 [04:38<11:30, 1.55it/s, loss=0.169, lr=1] Steps: 29%|██▊ | 430/1500 [04:39<11:29, 1.55it/s, loss=0.169, lr=1] Steps: 29%|██▊ | 430/1500 [04:39<11:29, 1.55it/s, loss=0.0719, lr=1] Steps: 29%|██▊ | 431/1500 [04:39<11:28, 1.55it/s, loss=0.0719, lr=1] Steps: 29%|██▊ | 431/1500 [04:39<11:28, 1.55it/s, loss=0.166, lr=1] Steps: 29%|██▉ | 432/1500 [04:40<11:27, 1.55it/s, loss=0.166, lr=1] Steps: 29%|██▉ | 432/1500 [04:40<11:27, 1.55it/s, loss=0.184, lr=1] Steps: 29%|██▉ | 433/1500 [04:41<11:31, 1.54it/s, loss=0.184, lr=1] Steps: 29%|██▉ | 433/1500 [04:41<11:31, 1.54it/s, loss=0.16, lr=1] Steps: 29%|██▉ | 434/1500 [04:41<11:29, 1.55it/s, loss=0.16, lr=1] Steps: 29%|██▉ | 434/1500 [04:41<11:29, 
1.55it/s, loss=0.348, lr=1] Steps: 29%|██▉ | 435/1500 [04:42<11:28, 1.55it/s, loss=0.348, lr=1] Steps: 29%|██▉ | 435/1500 [04:42<11:28, 1.55it/s, loss=0.141, lr=1] Steps: 29%|██▉ | 436/1500 [04:43<11:26, 1.55it/s, loss=0.141, lr=1] Steps: 29%|██▉ | 436/1500 [04:43<11:26, 1.55it/s, loss=0.101, lr=1] Steps: 29%|██▉ | 437/1500 [04:43<11:25, 1.55it/s, loss=0.101, lr=1] Steps: 29%|██▉ | 437/1500 [04:43<11:25, 1.55it/s, loss=0.34, lr=1] Steps: 29%|██▉ | 438/1500 [04:44<11:26, 1.55it/s, loss=0.34, lr=1] Steps: 29%|██▉ | 438/1500 [04:44<11:26, 1.55it/s, loss=0.202, lr=1] Steps: 29%|██▉ | 439/1500 [04:44<11:24, 1.55it/s, loss=0.202, lr=1] Steps: 29%|██▉ | 439/1500 [04:44<11:24, 1.55it/s, loss=0.178, lr=1] Steps: 29%|██▉ | 440/1500 [04:45<11:22, 1.55it/s, loss=0.178, lr=1] Steps: 29%|██▉ | 440/1500 [04:45<11:22, 1.55it/s, loss=0.0855, lr=1] Steps: 29%|██▉ | 441/1500 [04:46<11:22, 1.55it/s, loss=0.0855, lr=1] Steps: 29%|██▉ | 441/1500 [04:46<11:22, 1.55it/s, loss=0.123, lr=1] Steps: 29%|██▉ | 442/1500 [04:46<11:22, 1.55it/s, loss=0.123, lr=1] Steps: 29%|██▉ | 442/1500 [04:46<11:22, 1.55it/s, loss=0.0254, lr=1] Steps: 30%|██▉ | 443/1500 [04:47<11:22, 1.55it/s, loss=0.0254, lr=1] Steps: 30%|██▉ | 443/1500 [04:47<11:22, 1.55it/s, loss=0.168, lr=1] Steps: 30%|██▉ | 444/1500 [04:48<11:21, 1.55it/s, loss=0.168, lr=1] Steps: 30%|██▉ | 444/1500 [04:48<11:21, 1.55it/s, loss=0.0671, lr=1] Steps: 30%|██▉ | 445/1500 [04:48<11:20, 1.55it/s, loss=0.0671, lr=1] Steps: 30%|██▉ | 445/1500 [04:48<11:20, 1.55it/s, loss=0.112, lr=1] Steps: 30%|██▉ | 446/1500 [04:49<11:19, 1.55it/s, loss=0.112, lr=1] Steps: 30%|██▉ | 446/1500 [04:49<11:19, 1.55it/s, loss=0.246, lr=1] Steps: 30%|██▉ | 447/1500 [04:50<11:18, 1.55it/s, loss=0.246, lr=1] Steps: 30%|██▉ | 447/1500 [04:50<11:18, 1.55it/s, loss=0.496, lr=1] Steps: 30%|██▉ | 448/1500 [04:50<11:17, 1.55it/s, loss=0.496, lr=1] Steps: 30%|██▉ | 448/1500 [04:50<11:17, 1.55it/s, loss=0.137, lr=1] Steps: 30%|██▉ | 449/1500 [04:51<11:21, 1.54it/s, loss=0.137, 
lr=1] Steps: 30%|██▉ | 449/1500 [04:51<11:21, 1.54it/s, loss=0.157, lr=1] Steps: 30%|███ | 450/1500 [04:52<11:19, 1.55it/s, loss=0.157, lr=1] Steps: 30%|███ | 450/1500 [04:52<11:19, 1.55it/s, loss=0.0549, lr=1] Steps: 30%|███ | 451/1500 [04:52<11:17, 1.55it/s, loss=0.0549, lr=1] Steps: 30%|███ | 451/1500 [04:52<11:17, 1.55it/s, loss=0.106, lr=1] Steps: 30%|███ | 452/1500 [04:53<11:16, 1.55it/s, loss=0.106, lr=1] Steps: 30%|███ | 452/1500 [04:53<11:16, 1.55it/s, loss=0.134, lr=1] Steps: 30%|███ | 453/1500 [04:53<11:14, 1.55it/s, loss=0.134, lr=1] Steps: 30%|███ | 453/1500 [04:53<11:14, 1.55it/s, loss=0.11, lr=1] Steps: 30%|███ | 454/1500 [04:54<11:13, 1.55it/s, loss=0.11, lr=1] Steps: 30%|███ | 454/1500 [04:54<11:13, 1.55it/s, loss=0.135, lr=1] Steps: 30%|███ | 455/1500 [04:55<11:12, 1.55it/s, loss=0.135, lr=1] Steps: 30%|███ | 455/1500 [04:55<11:12, 1.55it/s, loss=0.218, lr=1] Steps: 30%|███ | 456/1500 [04:55<11:12, 1.55it/s, loss=0.218, lr=1] Steps: 30%|███ | 456/1500 [04:55<11:12, 1.55it/s, loss=0.127, lr=1] Steps: 30%|███ | 457/1500 [04:56<11:12, 1.55it/s, loss=0.127, lr=1] Steps: 30%|███ | 457/1500 [04:56<11:12, 1.55it/s, loss=0.118, lr=1] Steps: 31%|███ | 458/1500 [04:57<11:12, 1.55it/s, loss=0.118, lr=1] Steps: 31%|███ | 458/1500 [04:57<11:12, 1.55it/s, loss=0.0756, lr=1] Steps: 31%|███ | 459/1500 [04:57<11:11, 1.55it/s, loss=0.0756, lr=1] Steps: 31%|███ | 459/1500 [04:57<11:11, 1.55it/s, loss=0.135, lr=1] Steps: 31%|███ | 460/1500 [04:58<11:10, 1.55it/s, loss=0.135, lr=1] Steps: 31%|███ | 460/1500 [04:58<11:10, 1.55it/s, loss=0.093, lr=1] Steps: 31%|███ | 461/1500 [04:59<11:09, 1.55it/s, loss=0.093, lr=1] Steps: 31%|███ | 461/1500 [04:59<11:09, 1.55it/s, loss=0.128, lr=1] Steps: 31%|███ | 462/1500 [04:59<11:09, 1.55it/s, loss=0.128, lr=1] Steps: 31%|███ | 462/1500 [04:59<11:09, 1.55it/s, loss=0.0816, lr=1] Steps: 31%|███ | 463/1500 [05:00<11:08, 1.55it/s, loss=0.0816, lr=1] Steps: 31%|███ | 463/1500 [05:00<11:08, 1.55it/s, loss=0.118, lr=1] Steps: 31%|███ | 
464/1500 [05:01<11:07, 1.55it/s, loss=0.118, lr=1] Steps: 31%|███ | 464/1500 [05:01<11:07, 1.55it/s, loss=0.0988, lr=1] Steps: 31%|███ | 465/1500 [05:01<11:10, 1.54it/s, loss=0.0988, lr=1] Steps: 31%|███ | 465/1500 [05:01<11:10, 1.54it/s, loss=0.088, lr=1] Steps: 31%|███ | 466/1500 [05:02<11:08, 1.55it/s, loss=0.088, lr=1] Steps: 31%|███ | 466/1500 [05:02<11:08, 1.55it/s, loss=0.0776, lr=1] Steps: 31%|███ | 467/1500 [05:03<11:07, 1.55it/s, loss=0.0776, lr=1] Steps: 31%|███ | 467/1500 [05:03<11:07, 1.55it/s, loss=0.0928, lr=1] Steps: 31%|███ | 468/1500 [05:03<11:05, 1.55it/s, loss=0.0928, lr=1] Steps: 31%|███ | 468/1500 [05:03<11:05, 1.55it/s, loss=0.0588, lr=1] Steps: 31%|███▏ | 469/1500 [05:04<11:05, 1.55it/s, loss=0.0588, lr=1] Steps: 31%|███▏ | 469/1500 [05:04<11:05, 1.55it/s, loss=0.172, lr=1] Steps: 31%|███▏ | 470/1500 [05:04<11:04, 1.55it/s, loss=0.172, lr=1] Steps: 31%|███▏ | 470/1500 [05:04<11:04, 1.55it/s, loss=0.0969, lr=1] Steps: 31%|███▏ | 471/1500 [05:05<11:02, 1.55it/s, loss=0.0969, lr=1] Steps: 31%|███▏ | 471/1500 [05:05<11:02, 1.55it/s, loss=0.106, lr=1] Steps: 31%|███▏ | 472/1500 [05:06<11:02, 1.55it/s, loss=0.106, lr=1] Steps: 31%|███▏ | 472/1500 [05:06<11:02, 1.55it/s, loss=0.172, lr=1] Steps: 32%|███▏ | 473/1500 [05:06<11:01, 1.55it/s, loss=0.172, lr=1] Steps: 32%|███▏ | 473/1500 [05:06<11:01, 1.55it/s, loss=0.074, lr=1] Steps: 32%|███▏ | 474/1500 [05:07<11:01, 1.55it/s, loss=0.074, lr=1] Steps: 32%|███▏ | 474/1500 [05:07<11:01, 1.55it/s, loss=0.255, lr=1] Steps: 32%|███▏ | 475/1500 [05:08<11:00, 1.55it/s, loss=0.255, lr=1] Steps: 32%|███▏ | 475/1500 [05:08<11:00, 1.55it/s, loss=0.166, lr=1] Steps: 32%|███▏ | 476/1500 [05:08<11:00, 1.55it/s, loss=0.166, lr=1] Steps: 32%|███▏ | 476/1500 [05:08<11:00, 1.55it/s, loss=0.0925, lr=1] Steps: 32%|███▏ | 477/1500 [05:09<10:59, 1.55it/s, loss=0.0925, lr=1] Steps: 32%|███▏ | 477/1500 [05:09<10:59, 1.55it/s, loss=0.0624, lr=1] Steps: 32%|███▏ | 478/1500 [05:10<10:58, 1.55it/s, loss=0.0624, lr=1] Steps: 
32%|███▏ | 478/1500 [05:10<10:58, 1.55it/s, loss=0.134, lr=1] Steps: 32%|███▏ | 479/1500 [05:10<10:57, 1.55it/s, loss=0.134, lr=1] Steps: 32%|███▏ | 479/1500 [05:10<10:57, 1.55it/s, loss=0.244, lr=1] Steps: 32%|███▏ | 480/1500 [05:11<10:57, 1.55it/s, loss=0.244, lr=1] Steps: 32%|███▏ | 480/1500 [05:11<10:57, 1.55it/s, loss=0.18, lr=1] Steps: 32%|███▏ | 481/1500 [05:12<11:00, 1.54it/s, loss=0.18, lr=1] Steps: 32%|███▏ | 481/1500 [05:12<11:00, 1.54it/s, loss=0.0471, lr=1] Steps: 32%|███▏ | 482/1500 [05:12<10:58, 1.55it/s, loss=0.0471, lr=1] Steps: 32%|███▏ | 482/1500 [05:12<10:58, 1.55it/s, loss=0.295, lr=1] Steps: 32%|███▏ | 483/1500 [05:13<10:57, 1.55it/s, loss=0.295, lr=1] Steps: 32%|███▏ | 483/1500 [05:13<10:57, 1.55it/s, loss=0.28, lr=1] Steps: 32%|███▏ | 484/1500 [05:14<11:43, 1.44it/s, loss=0.28, lr=1] Steps: 32%|███▏ | 484/1500 [05:14<11:43, 1.44it/s, loss=0.117, lr=1] Steps: 32%|███▏ | 485/1500 [05:14<11:28, 1.47it/s, loss=0.117, lr=1] Steps: 32%|███▏ | 485/1500 [05:14<11:28, 1.47it/s, loss=0.183, lr=1] Steps: 32%|███▏ | 486/1500 [05:15<11:17, 1.50it/s, loss=0.183, lr=1] Steps: 32%|███▏ | 486/1500 [05:15<11:17, 1.50it/s, loss=0.0755, lr=1] Steps: 32%|███▏ | 487/1500 [05:16<11:09, 1.51it/s, loss=0.0755, lr=1] Steps: 32%|███▏ | 487/1500 [05:16<11:09, 1.51it/s, loss=0.112, lr=1] Steps: 33%|███▎ | 488/1500 [05:16<11:03, 1.53it/s, loss=0.112, lr=1] Steps: 33%|███▎ | 488/1500 [05:16<11:03, 1.53it/s, loss=0.191, lr=1] Steps: 33%|███▎ | 489/1500 [05:17<10:59, 1.53it/s, loss=0.191, lr=1] Steps: 33%|███▎ | 489/1500 [05:17<10:59, 1.53it/s, loss=0.172, lr=1] Steps: 33%|███▎ | 490/1500 [05:18<10:56, 1.54it/s, loss=0.172, lr=1] Steps: 33%|███▎ | 490/1500 [05:18<10:56, 1.54it/s, loss=0.108, lr=1] Steps: 33%|███▎ | 491/1500 [05:18<10:53, 1.54it/s, loss=0.108, lr=1] Steps: 33%|███▎ | 491/1500 [05:18<10:53, 1.54it/s, loss=0.118, lr=1] Steps: 33%|███▎ | 492/1500 [05:19<10:52, 1.55it/s, loss=0.118, lr=1] Steps: 33%|███▎ | 492/1500 [05:19<10:52, 1.55it/s, loss=0.124, lr=1] 
Steps: 33%|███▎ | 493/1500 [05:19<10:50, 1.55it/s, loss=0.124, lr=1] Steps: 33%|███▎ | 493/1500 [05:19<10:50, 1.55it/s, loss=0.0977, lr=1] Steps: 33%|███▎ | 494/1500 [05:20<10:49, 1.55it/s, loss=0.0977, lr=1] Steps: 33%|███▎ | 494/1500 [05:20<10:49, 1.55it/s, loss=0.149, lr=1] Steps: 33%|███▎ | 495/1500 [05:21<10:49, 1.55it/s, loss=0.149, lr=1] Steps: 33%|███▎ | 495/1500 [05:21<10:49, 1.55it/s, loss=0.127, lr=1] Steps: 33%|███▎ | 496/1500 [05:21<10:48, 1.55it/s, loss=0.127, lr=1] Steps: 33%|███▎ | 496/1500 [05:21<10:48, 1.55it/s, loss=0.0968, lr=1] Steps: 33%|███▎ | 497/1500 [05:22<10:51, 1.54it/s, loss=0.0968, lr=1] Steps: 33%|███▎ | 497/1500 [05:22<10:51, 1.54it/s, loss=0.246, lr=1] Steps: 33%|███▎ | 498/1500 [05:23<10:49, 1.54it/s, loss=0.246, lr=1] Steps: 33%|███▎ | 498/1500 [05:23<10:49, 1.54it/s, loss=0.211, lr=1] Steps: 33%|███▎ | 499/1500 [05:23<10:47, 1.54it/s, loss=0.211, lr=1] Steps: 33%|███▎ | 499/1500 [05:23<10:47, 1.54it/s, loss=0.0595, lr=1] Steps: 33%|███▎ | 500/1500 [05:24<10:47, 1.54it/s, loss=0.0595, lr=1] Steps: 33%|███▎ | 500/1500 [05:24<10:47, 1.54it/s, loss=0.217, lr=1] Steps: 33%|███▎ | 501/1500 [05:25<10:46, 1.55it/s, loss=0.217, lr=1] Steps: 33%|███▎ | 501/1500 [05:25<10:46, 1.55it/s, loss=0.234, lr=1] Steps: 33%|███▎ | 502/1500 [05:25<10:45, 1.55it/s, loss=0.234, lr=1] Steps: 33%|███▎ | 502/1500 [05:25<10:45, 1.55it/s, loss=0.189, lr=1] Steps: 34%|███▎ | 503/1500 [05:26<10:45, 1.55it/s, loss=0.189, lr=1] Steps: 34%|███▎ | 503/1500 [05:26<10:45, 1.55it/s, loss=0.0568, lr=1] Steps: 34%|███▎ | 504/1500 [05:27<10:43, 1.55it/s, loss=0.0568, lr=1] Steps: 34%|███▎ | 504/1500 [05:27<10:43, 1.55it/s, loss=0.124, lr=1] Steps: 34%|███▎ | 505/1500 [05:27<10:42, 1.55it/s, loss=0.124, lr=1] Steps: 34%|███▎ | 505/1500 [05:27<10:42, 1.55it/s, loss=0.144, lr=1] Steps: 34%|███▎ | 506/1500 [05:28<10:41, 1.55it/s, loss=0.144, lr=1] Steps: 34%|███▎ | 506/1500 [05:28<10:41, 1.55it/s, loss=0.0346, lr=1] Steps: 34%|███▍ | 507/1500 [05:28<10:41, 1.55it/s, 
loss=0.0346, lr=1] Steps: 34%|███▍ | 507/1500 [05:28<10:41, 1.55it/s, loss=0.0517, lr=1] Steps: 34%|███▍ | 508/1500 [05:29<10:40, 1.55it/s, loss=0.0517, lr=1] Steps: 34%|███▍ | 508/1500 [05:29<10:40, 1.55it/s, loss=0.119, lr=1] Steps: 34%|███▍ | 509/1500 [05:30<10:39, 1.55it/s, loss=0.119, lr=1] Steps: 34%|███▍ | 509/1500 [05:30<10:39, 1.55it/s, loss=0.095, lr=1] Steps: 34%|███▍ | 510/1500 [05:30<10:38, 1.55it/s, loss=0.095, lr=1] Steps: 34%|███▍ | 510/1500 [05:30<10:38, 1.55it/s, loss=0.18, lr=1] Steps: 34%|███▍ | 511/1500 [05:31<10:38, 1.55it/s, loss=0.18, lr=1] Steps: 34%|███▍ | 511/1500 [05:31<10:38, 1.55it/s, loss=0.247, lr=1] Steps: 34%|███▍ | 512/1500 [05:32<10:37, 1.55it/s, loss=0.247, lr=1] Steps: 34%|███▍ | 512/1500 [05:32<10:37, 1.55it/s, loss=0.0814, lr=1] Steps: 34%|███▍ | 513/1500 [05:32<10:40, 1.54it/s, loss=0.0814, lr=1] Steps: 34%|███▍ | 513/1500 [05:32<10:40, 1.54it/s, loss=0.135, lr=1] Steps: 34%|███▍ | 514/1500 [05:33<10:39, 1.54it/s, loss=0.135, lr=1] Steps: 34%|███▍ | 514/1500 [05:33<10:39, 1.54it/s, loss=0.201, lr=1] Steps: 34%|███▍ | 515/1500 [05:34<10:40, 1.54it/s, loss=0.201, lr=1] Steps: 34%|███▍ | 515/1500 [05:34<10:40, 1.54it/s, loss=0.198, lr=1] Steps: 34%|███▍ | 516/1500 [05:34<10:43, 1.53it/s, loss=0.198, lr=1] Steps: 34%|███▍ | 516/1500 [05:34<10:43, 1.53it/s, loss=0.0714, lr=1] Steps: 34%|███▍ | 517/1500 [05:35<10:40, 1.54it/s, loss=0.0714, lr=1] Steps: 34%|███▍ | 517/1500 [05:35<10:40, 1.54it/s, loss=0.112, lr=1] Steps: 35%|███▍ | 518/1500 [05:36<10:37, 1.54it/s, loss=0.112, lr=1] Steps: 35%|███▍ | 518/1500 [05:36<10:37, 1.54it/s, loss=0.0658, lr=1] Steps: 35%|███▍ | 519/1500 [05:36<10:35, 1.54it/s, loss=0.0658, lr=1] Steps: 35%|███▍ | 519/1500 [05:36<10:35, 1.54it/s, loss=0.157, lr=1] Steps: 35%|███▍ | 520/1500 [05:37<10:34, 1.54it/s, loss=0.157, lr=1] Steps: 35%|███▍ | 520/1500 [05:37<10:34, 1.54it/s, loss=0.0401, lr=1] Steps: 35%|███▍ | 521/1500 [05:38<10:33, 1.54it/s, loss=0.0401, lr=1] Steps: 35%|███▍ | 521/1500 [05:38<10:33, 
1.54it/s, loss=0.0419, lr=1] Steps: 35%|███▍ | 522/1500 [05:38<10:31, 1.55it/s, loss=0.0419, lr=1] Steps: 35%|███▍ | 522/1500 [05:38<10:31, 1.55it/s, loss=0.129, lr=1] Steps: 35%|███▍ | 523/1500 [05:39<10:30, 1.55it/s, loss=0.129, lr=1] Steps: 35%|███▍ | 523/1500 [05:39<10:30, 1.55it/s, loss=0.109, lr=1] Steps: 35%|███▍ | 524/1500 [05:40<10:29, 1.55it/s, loss=0.109, lr=1] Steps: 35%|███▍ | 524/1500 [05:40<10:29, 1.55it/s, loss=0.0802, lr=1] Steps: 35%|███▌ | 525/1500 [05:40<10:28, 1.55it/s, loss=0.0802, lr=1] Steps: 35%|███▌ | 525/1500 [05:40<10:28, 1.55it/s, loss=0.156, lr=1] Steps: 35%|███▌ | 526/1500 [05:41<10:27, 1.55it/s, loss=0.156, lr=1] Steps: 35%|███▌ | 526/1500 [05:41<10:27, 1.55it/s, loss=0.115, lr=1] Steps: 35%|███▌ | 527/1500 [05:41<10:26, 1.55it/s, loss=0.115, lr=1] Steps: 35%|███▌ | 527/1500 [05:41<10:26, 1.55it/s, loss=0.279, lr=1] Steps: 35%|███▌ | 528/1500 [05:42<10:25, 1.55it/s, loss=0.279, lr=1] Steps: 35%|███▌ | 528/1500 [05:42<10:25, 1.55it/s, loss=0.237, lr=1] Steps: 35%|███▌ | 529/1500 [05:43<10:29, 1.54it/s, loss=0.237, lr=1] Steps: 35%|███▌ | 529/1500 [05:43<10:29, 1.54it/s, loss=0.057, lr=1] Steps: 35%|███▌ | 530/1500 [05:43<10:27, 1.55it/s, loss=0.057, lr=1] Steps: 35%|███▌ | 530/1500 [05:43<10:27, 1.55it/s, loss=0.179, lr=1] Steps: 35%|███▌ | 531/1500 [05:44<10:25, 1.55it/s, loss=0.179, lr=1] Steps: 35%|███▌ | 531/1500 [05:44<10:25, 1.55it/s, loss=0.139, lr=1] Steps: 35%|███▌ | 532/1500 [05:45<10:24, 1.55it/s, loss=0.139, lr=1] Steps: 35%|███▌ | 532/1500 [05:45<10:24, 1.55it/s, loss=0.126, lr=1] Steps: 36%|███▌ | 533/1500 [05:45<10:23, 1.55it/s, loss=0.126, lr=1] Steps: 36%|███▌ | 533/1500 [05:45<10:23, 1.55it/s, loss=0.0814, lr=1] Steps: 36%|███▌ | 534/1500 [05:46<10:22, 1.55it/s, loss=0.0814, lr=1] Steps: 36%|███▌ | 534/1500 [05:46<10:22, 1.55it/s, loss=0.191, lr=1] Steps: 36%|███▌ | 535/1500 [05:47<10:22, 1.55it/s, loss=0.191, lr=1] Steps: 36%|███▌ | 535/1500 [05:47<10:22, 1.55it/s, loss=0.171, lr=1] Steps: 36%|███▌ | 536/1500 
[05:47<10:21, 1.55it/s, loss=0.171, lr=1] Steps: 36%|███▌ | 536/1500 [05:47<10:21, 1.55it/s, loss=0.129, lr=1] Steps: 36%|███▌ | 537/1500 [05:48<10:20, 1.55it/s, loss=0.129, lr=1] Steps: 36%|███▌ | 537/1500 [05:48<10:20, 1.55it/s, loss=0.067, lr=1] Steps: 36%|███▌ | 538/1500 [05:49<10:19, 1.55it/s, loss=0.067, lr=1] Steps: 36%|███▌ | 538/1500 [05:49<10:19, 1.55it/s, loss=0.279, lr=1] Steps: 36%|███▌ | 539/1500 [05:49<10:18, 1.55it/s, loss=0.279, lr=1] Steps: 36%|███▌ | 539/1500 [05:49<10:18, 1.55it/s, loss=0.277, lr=1] Steps: 36%|███▌ | 540/1500 [05:50<10:17, 1.55it/s, loss=0.277, lr=1] Steps: 36%|███▌ | 540/1500 [05:50<10:17, 1.55it/s, loss=0.257, lr=1] Steps: 36%|███▌ | 541/1500 [05:50<10:16, 1.55it/s, loss=0.257, lr=1] Steps: 36%|███▌ | 541/1500 [05:50<10:16, 1.55it/s, loss=0.348, lr=1] Steps: 36%|███▌ | 542/1500 [05:51<10:15, 1.56it/s, loss=0.348, lr=1] Steps: 36%|███▌ | 542/1500 [05:51<10:15, 1.56it/s, loss=0.061, lr=1] Steps: 36%|███▌ | 543/1500 [05:52<10:15, 1.55it/s, loss=0.061, lr=1] Steps: 36%|███▌ | 543/1500 [05:52<10:15, 1.55it/s, loss=0.0925, lr=1] Steps: 36%|███▋ | 544/1500 [05:52<10:15, 1.55it/s, loss=0.0925, lr=1] Steps: 36%|███▋ | 544/1500 [05:52<10:15, 1.55it/s, loss=0.206, lr=1] Steps: 36%|███▋ | 545/1500 [05:53<10:17, 1.55it/s, loss=0.206, lr=1] Steps: 36%|███▋ | 545/1500 [05:53<10:17, 1.55it/s, loss=0.0522, lr=1] Steps: 36%|███▋ | 546/1500 [05:54<10:16, 1.55it/s, loss=0.0522, lr=1] Steps: 36%|███▋ | 546/1500 [05:54<10:16, 1.55it/s, loss=0.0838, lr=1] Steps: 36%|███▋ | 547/1500 [05:54<10:17, 1.54it/s, loss=0.0838, lr=1] Steps: 36%|███▋ | 547/1500 [05:54<10:17, 1.54it/s, loss=0.0663, lr=1] Steps: 37%|███▋ | 548/1500 [05:55<10:15, 1.55it/s, loss=0.0663, lr=1] Steps: 37%|███▋ | 548/1500 [05:55<10:15, 1.55it/s, loss=0.157, lr=1] Steps: 37%|███▋ | 549/1500 [05:56<10:15, 1.55it/s, loss=0.157, lr=1] Steps: 37%|███▋ | 549/1500 [05:56<10:15, 1.55it/s, loss=0.0236, lr=1] Steps: 37%|███▋ | 550/1500 [05:56<10:15, 1.54it/s, loss=0.0236, lr=1] Steps: 37%|███▋ 
| 550/1500 [05:56<10:15, 1.54it/s, loss=0.087, lr=1] Steps: 37%|███▋ | 551/1500 [05:57<10:14, 1.55it/s, loss=0.087, lr=1] Steps: 37%|███▋ | 551/1500 [05:57<10:14, 1.55it/s, loss=0.0453, lr=1] Steps: 37%|███▋ | 552/1500 [05:58<10:12, 1.55it/s, loss=0.0453, lr=1] Steps: 37%|███▋ | 552/1500 [05:58<10:12, 1.55it/s, loss=0.146, lr=1] Steps: 37%|███▋ | 553/1500 [05:58<10:11, 1.55it/s, loss=0.146, lr=1] Steps: 37%|███▋ | 553/1500 [05:58<10:11, 1.55it/s, loss=0.154, lr=1] Steps: 37%|███▋ | 554/1500 [05:59<10:10, 1.55it/s, loss=0.154, lr=1] Steps: 37%|███▋ | 554/1500 [05:59<10:10, 1.55it/s, loss=0.173, lr=1] Steps: 37%|███▋ | 555/1500 [06:00<10:09, 1.55it/s, loss=0.173, lr=1] Steps: 37%|███▋ | 555/1500 [06:00<10:09, 1.55it/s, loss=0.0916, lr=1] Steps: 37%|███▋ | 556/1500 [06:00<10:08, 1.55it/s, loss=0.0916, lr=1] Steps: 37%|███▋ | 556/1500 [06:00<10:08, 1.55it/s, loss=0.023, lr=1] Steps: 37%|███▋ | 557/1500 [06:01<10:08, 1.55it/s, loss=0.023, lr=1] Steps: 37%|███▋ | 557/1500 [06:01<10:08, 1.55it/s, loss=0.14, lr=1] Steps: 37%|███▋ | 558/1500 [06:01<10:07, 1.55it/s, loss=0.14, lr=1] Steps: 37%|███▋ | 558/1500 [06:01<10:07, 1.55it/s, loss=0.125, lr=1] Steps: 37%|███▋ | 559/1500 [06:02<10:06, 1.55it/s, loss=0.125, lr=1] Steps: 37%|███▋ | 559/1500 [06:02<10:06, 1.55it/s, loss=0.146, lr=1] Steps: 37%|███▋ | 560/1500 [06:03<10:05, 1.55it/s, loss=0.146, lr=1] Steps: 37%|███▋ | 560/1500 [06:03<10:05, 1.55it/s, loss=0.0552, lr=1] Steps: 37%|███▋ | 561/1500 [06:03<10:09, 1.54it/s, loss=0.0552, lr=1] Steps: 37%|███▋ | 561/1500 [06:03<10:09, 1.54it/s, loss=0.207, lr=1] Steps: 37%|███▋ | 562/1500 [06:04<10:07, 1.54it/s, loss=0.207, lr=1] Steps: 37%|███▋ | 562/1500 [06:04<10:07, 1.54it/s, loss=0.189, lr=1] Steps: 38%|███▊ | 563/1500 [06:05<10:05, 1.55it/s, loss=0.189, lr=1] Steps: 38%|███▊ | 563/1500 [06:05<10:05, 1.55it/s, loss=0.168, lr=1] Steps: 38%|███▊ | 564/1500 [06:05<10:04, 1.55it/s, loss=0.168, lr=1] Steps: 38%|███▊ | 564/1500 [06:05<10:04, 1.55it/s, loss=0.0482, lr=1] Steps: 
38%|███▊ | 565/1500 [06:06<10:03, 1.55it/s, loss=0.0482, lr=1] Steps: 38%|███▊ | 565/1500 [06:06<10:03, 1.55it/s, loss=0.113, lr=1] Steps: 38%|███▊ | 566/1500 [06:07<10:02, 1.55it/s, loss=0.113, lr=1] Steps: 38%|███▊ | 566/1500 [06:07<10:02, 1.55it/s, loss=0.108, lr=1] Steps: 38%|███▊ | 567/1500 [06:07<10:01, 1.55it/s, loss=0.108, lr=1] Steps: 38%|███▊ | 567/1500 [06:07<10:01, 1.55it/s, loss=0.137, lr=1] Steps: 38%|███▊ | 568/1500 [06:08<10:00, 1.55it/s, loss=0.137, lr=1] Steps: 38%|███▊ | 568/1500 [06:08<10:00, 1.55it/s, loss=0.167, lr=1] Steps: 38%|███▊ | 569/1500 [06:09<09:59, 1.55it/s, loss=0.167, lr=1] Steps: 38%|███▊ | 569/1500 [06:09<09:59, 1.55it/s, loss=0.195, lr=1] Steps: 38%|███▊ | 570/1500 [06:09<09:59, 1.55it/s, loss=0.195, lr=1] Steps: 38%|███▊ | 570/1500 [06:09<09:59, 1.55it/s, loss=0.104, lr=1] Steps: 38%|███▊ | 571/1500 [06:10<09:58, 1.55it/s, loss=0.104, lr=1] Steps: 38%|███▊ | 571/1500 [06:10<09:58, 1.55it/s, loss=0.129, lr=1] Steps: 38%|███▊ | 572/1500 [06:10<09:58, 1.55it/s, loss=0.129, lr=1] Steps: 38%|███▊ | 572/1500 [06:10<09:58, 1.55it/s, loss=0.0649, lr=1] Steps: 38%|███▊ | 573/1500 [06:11<09:57, 1.55it/s, loss=0.0649, lr=1] Steps: 38%|███▊ | 573/1500 [06:11<09:57, 1.55it/s, loss=0.156, lr=1] Steps: 38%|███▊ | 574/1500 [06:12<09:57, 1.55it/s, loss=0.156, lr=1] Steps: 38%|███▊ | 574/1500 [06:12<09:57, 1.55it/s, loss=0.0962, lr=1] Steps: 38%|███▊ | 575/1500 [06:12<09:56, 1.55it/s, loss=0.0962, lr=1] Steps: 38%|███▊ | 575/1500 [06:12<09:56, 1.55it/s, loss=0.139, lr=1] Steps: 38%|███▊ | 576/1500 [06:13<09:55, 1.55it/s, loss=0.139, lr=1] Steps: 38%|███▊ | 576/1500 [06:13<09:55, 1.55it/s, loss=0.232, lr=1] Steps: 38%|███▊ | 577/1500 [06:14<09:58, 1.54it/s, loss=0.232, lr=1] Steps: 38%|███▊ | 577/1500 [06:14<09:58, 1.54it/s, loss=0.174, lr=1] Steps: 39%|███▊ | 578/1500 [06:14<09:58, 1.54it/s, loss=0.174, lr=1] Steps: 39%|███▊ | 578/1500 [06:14<09:58, 1.54it/s, loss=0.0637, lr=1] Steps: 39%|███▊ | 579/1500 [06:15<09:56, 1.54it/s, loss=0.0637, 
lr=1] Steps: 39%|███▊ | 579/1500 [06:15<09:56, 1.54it/s, loss=0.11, lr=1] Steps: 39%|███▊ | 580/1500 [06:16<09:54, 1.55it/s, loss=0.11, lr=1] Steps: 39%|███▊ | 580/1500 [06:16<09:54, 1.55it/s, loss=0.152, lr=1] Steps: 39%|███▊ | 581/1500 [06:16<09:53, 1.55it/s, loss=0.152, lr=1] Steps: 39%|███▊ | 581/1500 [06:16<09:53, 1.55it/s, loss=0.166, lr=1] Steps: 39%|███▉ | 582/1500 [06:17<09:52, 1.55it/s, loss=0.166, lr=1] Steps: 39%|███▉ | 582/1500 [06:17<09:52, 1.55it/s, loss=0.0905, lr=1] Steps: 39%|███▉ | 583/1500 [06:18<09:52, 1.55it/s, loss=0.0905, lr=1] Steps: 39%|███▉ | 583/1500 [06:18<09:52, 1.55it/s, loss=0.112, lr=1] Steps: 39%|███▉ | 584/1500 [06:18<09:51, 1.55it/s, loss=0.112, lr=1] Steps: 39%|███▉ | 584/1500 [06:18<09:51, 1.55it/s, loss=0.22, lr=1] Steps: 39%|███▉ | 585/1500 [06:19<09:50, 1.55it/s, loss=0.22, lr=1] Steps: 39%|███▉ | 585/1500 [06:19<09:50, 1.55it/s, loss=0.281, lr=1] Steps: 39%|███▉ | 586/1500 [06:20<09:49, 1.55it/s, loss=0.281, lr=1] Steps: 39%|███▉ | 586/1500 [06:20<09:49, 1.55it/s, loss=0.103, lr=1] Steps: 39%|███▉ | 587/1500 [06:20<09:48, 1.55it/s, loss=0.103, lr=1] Steps: 39%|███▉ | 587/1500 [06:20<09:48, 1.55it/s, loss=0.141, lr=1] Steps: 39%|███▉ | 588/1500 [06:21<09:48, 1.55it/s, loss=0.141, lr=1] Steps: 39%|███▉ | 588/1500 [06:21<09:48, 1.55it/s, loss=0.109, lr=1] Steps: 39%|███▉ | 589/1500 [06:21<09:47, 1.55it/s, loss=0.109, lr=1] Steps: 39%|███▉ | 589/1500 [06:21<09:47, 1.55it/s, loss=0.127, lr=1] Steps: 39%|███▉ | 590/1500 [06:22<09:46, 1.55it/s, loss=0.127, lr=1] Steps: 39%|███▉ | 590/1500 [06:22<09:46, 1.55it/s, loss=0.125, lr=1] Steps: 39%|███▉ | 591/1500 [06:23<09:45, 1.55it/s, loss=0.125, lr=1] Steps: 39%|███▉ | 591/1500 [06:23<09:45, 1.55it/s, loss=0.0589, lr=1] Steps: 39%|███▉ | 592/1500 [06:23<09:45, 1.55it/s, loss=0.0589, lr=1] Steps: 39%|███▉ | 592/1500 [06:23<09:45, 1.55it/s, loss=0.251, lr=1] Steps: 40%|███▉ | 593/1500 [06:24<09:48, 1.54it/s, loss=0.251, lr=1] Steps: 40%|███▉ | 593/1500 [06:24<09:48, 1.54it/s, 
loss=0.218, lr=1] Steps: 40%|███▉ | 594/1500 [06:25<09:47, 1.54it/s, loss=0.218, lr=1] Steps: 40%|███▉ | 594/1500 [06:25<09:47, 1.54it/s, loss=0.169, lr=1] Steps: 40%|███▉ | 595/1500 [06:25<09:46, 1.54it/s, loss=0.169, lr=1] Steps: 40%|███▉ | 595/1500 [06:25<09:46, 1.54it/s, loss=0.0959, lr=1] Steps: 40%|███▉ | 596/1500 [06:26<09:45, 1.55it/s, loss=0.0959, lr=1] Steps: 40%|███▉ | 596/1500 [06:26<09:45, 1.55it/s, loss=0.135, lr=1] Steps: 40%|███▉ | 597/1500 [06:27<09:44, 1.55it/s, loss=0.135, lr=1] Steps: 40%|███▉ | 597/1500 [06:27<09:44, 1.55it/s, loss=0.137, lr=1] Steps: 40%|███▉ | 598/1500 [06:27<09:43, 1.55it/s, loss=0.137, lr=1] Steps: 40%|███▉ | 598/1500 [06:27<09:43, 1.55it/s, loss=0.229, lr=1] Steps: 40%|███▉ | 599/1500 [06:28<09:41, 1.55it/s, loss=0.229, lr=1] Steps: 40%|███▉ | 599/1500 [06:28<09:41, 1.55it/s, loss=0.0752, lr=1] Steps: 40%|████ | 600/1500 [06:29<09:40, 1.55it/s, loss=0.0752, lr=1] Steps: 40%|████ | 600/1500 [06:29<09:40, 1.55it/s, loss=0.246, lr=1] Steps: 40%|████ | 601/1500 [06:29<09:39, 1.55it/s, loss=0.246, lr=1] Steps: 40%|████ | 601/1500 [06:29<09:39, 1.55it/s, loss=0.176, lr=1] Steps: 40%|████ | 602/1500 [06:30<09:38, 1.55it/s, loss=0.176, lr=1] Steps: 40%|████ | 602/1500 [06:30<09:38, 1.55it/s, loss=0.153, lr=1] Steps: 40%|████ | 603/1500 [06:30<09:37, 1.55it/s, loss=0.153, lr=1] Steps: 40%|████ | 603/1500 [06:30<09:37, 1.55it/s, loss=0.174, lr=1] Steps: 40%|████ | 604/1500 [06:31<09:36, 1.55it/s, loss=0.174, lr=1] Steps: 40%|████ | 604/1500 [06:31<09:36, 1.55it/s, loss=0.124, lr=1] Steps: 40%|████ | 605/1500 [06:32<09:36, 1.55it/s, loss=0.124, lr=1] Steps: 40%|████ | 605/1500 [06:32<09:36, 1.55it/s, loss=0.141, lr=1] Steps: 40%|████ | 606/1500 [06:32<09:35, 1.55it/s, loss=0.141, lr=1] Steps: 40%|████ | 606/1500 [06:32<09:35, 1.55it/s, loss=0.0681, lr=1] Steps: 40%|████ | 607/1500 [06:33<09:34, 1.55it/s, loss=0.0681, lr=1] Steps: 40%|████ | 607/1500 [06:33<09:34, 1.55it/s, loss=0.148, lr=1] Steps: 41%|████ | 608/1500 [06:34<09:36, 
1.55it/s, loss=0.148, lr=1] Steps: 41%|████ | 608/1500 [06:34<09:36, 1.55it/s, loss=0.0295, lr=1] Steps: 41%|████ | 609/1500 [06:34<09:45, 1.52it/s, loss=0.0295, lr=1] Steps: 41%|████ | 609/1500 [06:34<09:45, 1.52it/s, loss=0.127, lr=1] Steps: 41%|████ | 610/1500 [06:35<09:42, 1.53it/s, loss=0.127, lr=1] Steps: 41%|████ | 610/1500 [06:35<09:42, 1.53it/s, loss=0.213, lr=1] Steps: 41%|████ | 611/1500 [06:36<09:38, 1.54it/s, loss=0.213, lr=1] Steps: 41%|████ | 611/1500 [06:36<09:38, 1.54it/s, loss=0.0364, lr=1] Steps: 41%|████ | 612/1500 [06:36<09:36, 1.54it/s, loss=0.0364, lr=1] Steps: 41%|████ | 612/1500 [06:36<09:36, 1.54it/s, loss=0.237, lr=1] Steps: 41%|████ | 613/1500 [06:37<09:34, 1.54it/s, loss=0.237, lr=1] Steps: 41%|████ | 613/1500 [06:37<09:34, 1.54it/s, loss=0.0965, lr=1] Steps: 41%|████ | 614/1500 [06:38<09:33, 1.55it/s, loss=0.0965, lr=1] Steps: 41%|████ | 614/1500 [06:38<09:33, 1.55it/s, loss=0.141, lr=1] Steps: 41%|████ | 615/1500 [06:38<09:31, 1.55it/s, loss=0.141, lr=1] Steps: 41%|████ | 615/1500 [06:38<09:31, 1.55it/s, loss=0.131, lr=1] Steps: 41%|████ | 616/1500 [06:39<09:30, 1.55it/s, loss=0.131, lr=1] Steps: 41%|████ | 616/1500 [06:39<09:30, 1.55it/s, loss=0.0822, lr=1] Steps: 41%|████ | 617/1500 [06:40<09:29, 1.55it/s, loss=0.0822, lr=1] Steps: 41%|████ | 617/1500 [06:40<09:29, 1.55it/s, loss=0.108, lr=1] Steps: 41%|████ | 618/1500 [06:40<09:28, 1.55it/s, loss=0.108, lr=1] Steps: 41%|████ | 618/1500 [06:40<09:28, 1.55it/s, loss=0.0857, lr=1] Steps: 41%|████▏ | 619/1500 [06:41<09:28, 1.55it/s, loss=0.0857, lr=1] Steps: 41%|████▏ | 619/1500 [06:41<09:28, 1.55it/s, loss=0.0704, lr=1] Steps: 41%|████▏ | 620/1500 [06:41<09:27, 1.55it/s, loss=0.0704, lr=1] Steps: 41%|████▏ | 620/1500 [06:41<09:27, 1.55it/s, loss=0.0982, lr=1] Steps: 41%|████▏ | 621/1500 [06:42<09:26, 1.55it/s, loss=0.0982, lr=1] Steps: 41%|████▏ | 621/1500 [06:42<09:26, 1.55it/s, loss=0.145, lr=1] Steps: 41%|████▏ | 622/1500 [06:43<09:25, 1.55it/s, loss=0.145, lr=1] Steps: 41%|████▏ | 
622/1500 [06:43<09:25, 1.55it/s, loss=0.144, lr=1] Steps: 42%|████▏ | 623/1500 [06:43<09:25, 1.55it/s, loss=0.144, lr=1] Steps: 42%|████▏ | 623/1500 [06:43<09:25, 1.55it/s, loss=0.167, lr=1] Steps: 42%|████▏ | 624/1500 [06:44<09:24, 1.55it/s, loss=0.167, lr=1] Steps: 42%|████▏ | 624/1500 [06:44<09:24, 1.55it/s, loss=0.119, lr=1] Steps: 42%|████▏ | 625/1500 [06:45<09:27, 1.54it/s, loss=0.119, lr=1] Steps: 42%|████▏ | 625/1500 [06:45<09:27, 1.54it/s, loss=0.0489, lr=1] Steps: 42%|████▏ | 626/1500 [06:45<09:25, 1.54it/s, loss=0.0489, lr=1] Steps: 42%|████▏ | 626/1500 [06:45<09:25, 1.54it/s, loss=0.0774, lr=1] Steps: 42%|████▏ | 627/1500 [06:46<09:24, 1.55it/s, loss=0.0774, lr=1] Steps: 42%|████▏ | 627/1500 [06:46<09:24, 1.55it/s, loss=0.1, lr=1] Steps: 42%|████▏ | 628/1500 [06:47<09:24, 1.54it/s, loss=0.1, lr=1] Steps: 42%|████▏ | 628/1500 [06:47<09:24, 1.54it/s, loss=0.13, lr=1] Steps: 42%|████▏ | 629/1500 [06:47<09:24, 1.54it/s, loss=0.13, lr=1] Steps: 42%|████▏ | 629/1500 [06:47<09:24, 1.54it/s, loss=0.0711, lr=1] Steps: 42%|████▏ | 630/1500 [06:48<09:24, 1.54it/s, loss=0.0711, lr=1] Steps: 42%|████▏ | 630/1500 [06:48<09:24, 1.54it/s, loss=0.118, lr=1] Steps: 42%|████▏ | 631/1500 [06:49<09:22, 1.54it/s, loss=0.118, lr=1] Steps: 42%|████▏ | 631/1500 [06:49<09:22, 1.54it/s, loss=0.109, lr=1] Steps: 42%|████▏ | 632/1500 [06:49<09:21, 1.55it/s, loss=0.109, lr=1] Steps: 42%|████▏ | 632/1500 [06:49<09:21, 1.55it/s, loss=0.0255, lr=1] Steps: 42%|████▏ | 633/1500 [06:50<09:20, 1.55it/s, loss=0.0255, lr=1] Steps: 42%|████▏ | 633/1500 [06:50<09:20, 1.55it/s, loss=0.0847, lr=1] Steps: 42%|████▏ | 634/1500 [06:51<09:20, 1.55it/s, loss=0.0847, lr=1] Steps: 42%|████▏ | 634/1500 [06:51<09:20, 1.55it/s, loss=0.141, lr=1] Steps: 42%|████▏ | 635/1500 [06:51<09:18, 1.55it/s, loss=0.141, lr=1] Steps: 42%|████▏ | 635/1500 [06:51<09:18, 1.55it/s, loss=0.167, lr=1] Steps: 42%|████▏ | 636/1500 [06:52<09:17, 1.55it/s, loss=0.167, lr=1] Steps: 42%|████▏ | 636/1500 [06:52<09:17, 1.55it/s, 
loss=0.0701, lr=1] Steps: 42%|████▏ | 637/1500 [06:52<09:16, 1.55it/s, loss=0.0701, lr=1] Steps: 42%|████▏ | 637/1500 [06:52<09:16, 1.55it/s, loss=0.161, lr=1] Steps: 43%|████▎ | 638/1500 [06:53<09:15, 1.55it/s, loss=0.161, lr=1] Steps: 43%|████▎ | 638/1500 [06:53<09:15, 1.55it/s, loss=0.22, lr=1] Steps: 43%|████▎ | 639/1500 [06:54<09:16, 1.55it/s, loss=0.22, lr=1] Steps: 43%|████▎ | 639/1500 [06:54<09:16, 1.55it/s, loss=0.0763, lr=1] Steps: 43%|████▎ | 640/1500 [06:54<09:14, 1.55it/s, loss=0.0763, lr=1] Steps: 43%|████▎ | 640/1500 [06:54<09:14, 1.55it/s, loss=0.125, lr=1] Steps: 43%|████▎ | 641/1500 [06:55<09:17, 1.54it/s, loss=0.125, lr=1] Steps: 43%|████▎ | 641/1500 [06:55<09:17, 1.54it/s, loss=0.15, lr=1] Steps: 43%|████▎ | 642/1500 [06:56<09:16, 1.54it/s, loss=0.15, lr=1] Steps: 43%|████▎ | 642/1500 [06:56<09:16, 1.54it/s, loss=0.112, lr=1] Steps: 43%|████▎ | 643/1500 [06:56<09:15, 1.54it/s, loss=0.112, lr=1] Steps: 43%|████▎ | 643/1500 [06:56<09:15, 1.54it/s, loss=0.146, lr=1] Steps: 43%|████▎ | 644/1500 [06:57<09:14, 1.54it/s, loss=0.146, lr=1] Steps: 43%|████▎ | 644/1500 [06:57<09:14, 1.54it/s, loss=0.0801, lr=1] Steps: 43%|████▎ | 645/1500 [06:58<09:12, 1.55it/s, loss=0.0801, lr=1] Steps: 43%|████▎ | 645/1500 [06:58<09:12, 1.55it/s, loss=0.0814, lr=1] Steps: 43%|████▎ | 646/1500 [06:58<09:12, 1.55it/s, loss=0.0814, lr=1] Steps: 43%|████▎ | 646/1500 [06:58<09:12, 1.55it/s, loss=0.0662, lr=1] Steps: 43%|████▎ | 647/1500 [06:59<09:12, 1.54it/s, loss=0.0662, lr=1] Steps: 43%|████▎ | 647/1500 [06:59<09:12, 1.54it/s, loss=0.161, lr=1] Steps: 43%|████▎ | 648/1500 [07:00<09:16, 1.53it/s, loss=0.161, lr=1] Steps: 43%|████▎ | 648/1500 [07:00<09:16, 1.53it/s, loss=0.149, lr=1] Steps: 43%|████▎ | 649/1500 [07:00<09:14, 1.53it/s, loss=0.149, lr=1] Steps: 43%|████▎ | 649/1500 [07:00<09:14, 1.53it/s, loss=0.0361, lr=1] Steps: 43%|████▎ | 650/1500 [07:01<09:13, 1.54it/s, loss=0.0361, lr=1] Steps: 43%|████▎ | 650/1500 [07:01<09:13, 1.54it/s, loss=0.193, lr=1] Steps: 
43%|████▎ | 651/1500 [07:02<09:11, 1.54it/s, loss=0.193, lr=1] Steps: 43%|████▎ | 651/1500 [07:02<09:11, 1.54it/s, loss=0.13, lr=1] Steps: 43%|████▎ | 652/1500 [07:02<09:09, 1.54it/s, loss=0.13, lr=1] Steps: 43%|████▎ | 652/1500 [07:02<09:09, 1.54it/s, loss=0.118, lr=1] Steps: 44%|████▎ | 653/1500 [07:03<09:08, 1.55it/s, loss=0.118, lr=1] Steps: 44%|████▎ | 653/1500 [07:03<09:08, 1.55it/s, loss=0.0854, lr=1] Steps: 44%|████▎ | 654/1500 [07:04<09:06, 1.55it/s, loss=0.0854, lr=1] Steps: 44%|████▎ | 654/1500 [07:04<09:06, 1.55it/s, loss=0.107, lr=1] Steps: 44%|████▎ | 655/1500 [07:04<09:09, 1.54it/s, loss=0.107, lr=1] Steps: 44%|████▎ | 655/1500 [07:04<09:09, 1.54it/s, loss=0.0942, lr=1] Steps: 44%|████▎ | 656/1500 [07:05<09:14, 1.52it/s, loss=0.0942, lr=1] Steps: 44%|████▎ | 656/1500 [07:05<09:14, 1.52it/s, loss=0.0586, lr=1] Steps: 44%|████▍ | 657/1500 [07:05<09:13, 1.52it/s, loss=0.0586, lr=1] Steps: 44%|████▍ | 657/1500 [07:05<09:13, 1.52it/s, loss=0.169, lr=1] Steps: 44%|████▍ | 658/1500 [07:06<09:10, 1.53it/s, loss=0.169, lr=1] Steps: 44%|████▍ | 658/1500 [07:06<09:10, 1.53it/s, loss=0.0382, lr=1] Steps: 44%|████▍ | 659/1500 [07:07<09:10, 1.53it/s, loss=0.0382, lr=1] Steps: 44%|████▍ | 659/1500 [07:07<09:10, 1.53it/s, loss=0.106, lr=1] Steps: 44%|████▍ | 660/1500 [07:07<09:07, 1.53it/s, loss=0.106, lr=1] Steps: 44%|████▍ | 660/1500 [07:07<09:07, 1.53it/s, loss=0.244, lr=1] Steps: 44%|████▍ | 661/1500 [07:08<09:05, 1.54it/s, loss=0.244, lr=1] Steps: 44%|████▍ | 661/1500 [07:08<09:05, 1.54it/s, loss=0.0492, lr=1] Steps: 44%|████▍ | 662/1500 [07:09<09:04, 1.54it/s, loss=0.0492, lr=1] Steps: 44%|████▍ | 662/1500 [07:09<09:04, 1.54it/s, loss=0.26, lr=1] Steps: 44%|████▍ | 663/1500 [07:09<09:03, 1.54it/s, loss=0.26, lr=1] Steps: 44%|████▍ | 663/1500 [07:09<09:03, 1.54it/s, loss=0.0834, lr=1] Steps: 44%|████▍ | 664/1500 [07:10<09:02, 1.54it/s, loss=0.0834, lr=1] Steps: 44%|████▍ | 664/1500 [07:10<09:02, 1.54it/s, loss=0.0897, lr=1] Steps: 44%|████▍ | 665/1500 
[07:11<09:01, 1.54it/s, loss=0.0897, lr=1] Steps: 44%|████▍ | 665/1500 [07:11<09:01, 1.54it/s, loss=0.105, lr=1] Steps: 44%|████▍ | 666/1500 [07:11<08:59, 1.54it/s, loss=0.105, lr=1] Steps: 44%|████▍ | 666/1500 [07:11<08:59, 1.54it/s, loss=0.156, lr=1] Steps: 44%|████▍ | 667/1500 [07:12<08:58, 1.55it/s, loss=0.156, lr=1] Steps: 44%|████▍ | 667/1500 [07:12<08:58, 1.55it/s, loss=0.183, lr=1] Steps: 45%|████▍ | 668/1500 [07:13<08:58, 1.54it/s, loss=0.183, lr=1] Steps: 45%|████▍ | 668/1500 [07:13<08:58, 1.54it/s, loss=0.275, lr=1] Steps: 45%|████▍ | 669/1500 [07:13<08:57, 1.54it/s, loss=0.275, lr=1] Steps: 45%|████▍ | 669/1500 [07:13<08:57, 1.54it/s, loss=0.0672, lr=1] Steps: 45%|████▍ | 670/1500 [07:14<08:57, 1.54it/s, loss=0.0672, lr=1] Steps: 45%|████▍ | 670/1500 [07:14<08:57, 1.54it/s, loss=0.126, lr=1] Steps: 45%|████▍ | 671/1500 [07:15<08:56, 1.54it/s, loss=0.126, lr=1] Steps: 45%|████▍ | 671/1500 [07:15<08:56, 1.54it/s, loss=0.116, lr=1] Steps: 45%|████▍ | 672/1500 [07:15<08:55, 1.55it/s, loss=0.116, lr=1] Steps: 45%|████▍ | 672/1500 [07:15<08:55, 1.55it/s, loss=0.116, lr=1] Steps: 45%|████▍ | 673/1500 [07:16<08:57, 1.54it/s, loss=0.116, lr=1] Steps: 45%|████▍ | 673/1500 [07:16<08:57, 1.54it/s, loss=0.106, lr=1] Steps: 45%|████▍ | 674/1500 [07:17<08:55, 1.54it/s, loss=0.106, lr=1] Steps: 45%|████▍ | 674/1500 [07:17<08:55, 1.54it/s, loss=0.239, lr=1] Steps: 45%|████▌ | 675/1500 [07:17<08:53, 1.55it/s, loss=0.239, lr=1] Steps: 45%|████▌ | 675/1500 [07:17<08:53, 1.55it/s, loss=0.0977, lr=1] Steps: 45%|████▌ | 676/1500 [07:18<08:52, 1.55it/s, loss=0.0977, lr=1] Steps: 45%|████▌ | 676/1500 [07:18<08:52, 1.55it/s, loss=0.226, lr=1] Steps: 45%|████▌ | 677/1500 [07:18<08:51, 1.55it/s, loss=0.226, lr=1] Steps: 45%|████▌ | 677/1500 [07:18<08:51, 1.55it/s, loss=0.0782, lr=1] Steps: 45%|████▌ | 678/1500 [07:19<08:50, 1.55it/s, loss=0.0782, lr=1] Steps: 45%|████▌ | 678/1500 [07:19<08:50, 1.55it/s, loss=0.0763, lr=1] Steps: 45%|████▌ | 679/1500 [07:20<08:50, 1.55it/s, 
loss=0.0763, lr=1] Steps: 45%|████▌ | 679/1500 [07:20<08:50, 1.55it/s, loss=0.144, lr=1] Steps: 45%|████▌ | 680/1500 [07:20<08:50, 1.54it/s, loss=0.144, lr=1] Steps: 45%|████▌ | 680/1500 [07:20<08:50, 1.54it/s, loss=0.0528, lr=1] Steps: 45%|████▌ | 681/1500 [07:21<08:50, 1.54it/s, loss=0.0528, lr=1] Steps: 45%|████▌ | 681/1500 [07:21<08:50, 1.54it/s, loss=0.135, lr=1] Steps: 45%|████▌ | 682/1500 [07:22<08:50, 1.54it/s, loss=0.135, lr=1] Steps: 45%|████▌ | 682/1500 [07:22<08:50, 1.54it/s, loss=0.193, lr=1] Steps: 46%|████▌ | 683/1500 [07:22<08:49, 1.54it/s, loss=0.193, lr=1] Steps: 46%|████▌ | 683/1500 [07:22<08:49, 1.54it/s, loss=0.153, lr=1] Steps: 46%|████▌ | 684/1500 [07:23<08:48, 1.54it/s, loss=0.153, lr=1] Steps: 46%|████▌ | 684/1500 [07:23<08:48, 1.54it/s, loss=0.131, lr=1] Steps: 46%|████▌ | 685/1500 [07:24<08:48, 1.54it/s, loss=0.131, lr=1] Steps: 46%|████▌ | 685/1500 [07:24<08:48, 1.54it/s, loss=0.0853, lr=1] Steps: 46%|████▌ | 686/1500 [07:24<08:47, 1.54it/s, loss=0.0853, lr=1] Steps: 46%|████▌ | 686/1500 [07:24<08:47, 1.54it/s, loss=0.106, lr=1] Steps: 46%|████▌ | 687/1500 [07:25<08:46, 1.54it/s, loss=0.106, lr=1] Steps: 46%|████▌ | 687/1500 [07:25<08:46, 1.54it/s, loss=0.259, lr=1] Steps: 46%|████▌ | 688/1500 [07:26<08:44, 1.55it/s, loss=0.259, lr=1] Steps: 46%|████▌ | 688/1500 [07:26<08:44, 1.55it/s, loss=0.163, lr=1] Steps: 46%|████▌ | 689/1500 [07:26<08:48, 1.53it/s, loss=0.163, lr=1] Steps: 46%|████▌ | 689/1500 [07:26<08:48, 1.53it/s, loss=0.0191, lr=1] Steps: 46%|████▌ | 690/1500 [07:27<08:46, 1.54it/s, loss=0.0191, lr=1] Steps: 46%|████▌ | 690/1500 [07:27<08:46, 1.54it/s, loss=0.00796, lr=1] Steps: 46%|████▌ | 691/1500 [07:28<08:44, 1.54it/s, loss=0.00796, lr=1] Steps: 46%|████▌ | 691/1500 [07:28<08:44, 1.54it/s, loss=0.154, lr=1] Steps: 46%|████▌ | 692/1500 [07:28<08:42, 1.55it/s, loss=0.154, lr=1] Steps: 46%|████▌ | 692/1500 [07:28<08:42, 1.55it/s, loss=0.0745, lr=1] Steps: 46%|████▌ | 693/1500 [07:29<08:41, 1.55it/s, loss=0.0745, lr=1] Steps: 
46%|████▌ | 693/1500 [07:29<08:41, 1.55it/s, loss=0.159, lr=1] Steps: 46%|████▋ | 694/1500 [07:29<08:40, 1.55it/s, loss=0.159, lr=1] Steps: 46%|████▋ | 694/1500 [07:29<08:40, 1.55it/s, loss=0.148, lr=1] Steps: 46%|████▋ | 695/1500 [07:30<08:39, 1.55it/s, loss=0.148, lr=1] Steps: 46%|████▋ | 695/1500 [07:30<08:39, 1.55it/s, loss=0.233, lr=1] Steps: 46%|████▋ | 696/1500 [07:31<08:38, 1.55it/s, loss=0.233, lr=1] Steps: 46%|████▋ | 696/1500 [07:31<08:38, 1.55it/s, loss=0.119, lr=1] Steps: 46%|████▋ | 697/1500 [07:31<08:37, 1.55it/s, loss=0.119, lr=1] Steps: 46%|████▋ | 697/1500 [07:31<08:37, 1.55it/s, loss=0.111, lr=1] Steps: 47%|████▋ | 698/1500 [07:32<08:37, 1.55it/s, loss=0.111, lr=1] Steps: 47%|████▋ | 698/1500 [07:32<08:37, 1.55it/s, loss=0.206, lr=1] Steps: 47%|████▋ | 699/1500 [07:33<08:36, 1.55it/s, loss=0.206, lr=1] Steps: 47%|████▋ | 699/1500 [07:33<08:36, 1.55it/s, loss=0.154, lr=1] Steps: 47%|████▋ | 700/1500 [07:33<08:35, 1.55it/s, loss=0.154, lr=1] Steps: 47%|████▋ | 700/1500 [07:33<08:35, 1.55it/s, loss=0.246, lr=1] Steps: 47%|████▋ | 701/1500 [07:34<08:37, 1.54it/s, loss=0.246, lr=1] Steps: 47%|████▋ | 701/1500 [07:34<08:37, 1.54it/s, loss=0.025, lr=1] Steps: 47%|████▋ | 702/1500 [07:35<08:35, 1.55it/s, loss=0.025, lr=1] Steps: 47%|████▋ | 702/1500 [07:35<08:35, 1.55it/s, loss=0.161, lr=1] Steps: 47%|████▋ | 703/1500 [07:35<08:34, 1.55it/s, loss=0.161, lr=1] Steps: 47%|████▋ | 703/1500 [07:35<08:34, 1.55it/s, loss=0.0645, lr=1] Steps: 47%|████▋ | 704/1500 [07:36<08:33, 1.55it/s, loss=0.0645, lr=1] Steps: 47%|████▋ | 704/1500 [07:36<08:33, 1.55it/s, loss=0.135, lr=1] Steps: 47%|████▋ | 705/1500 [07:37<08:36, 1.54it/s, loss=0.135, lr=1] Steps: 47%|████▋ | 705/1500 [07:37<08:36, 1.54it/s, loss=0.0357, lr=1] Steps: 47%|████▋ | 706/1500 [07:37<08:34, 1.54it/s, loss=0.0357, lr=1] Steps: 47%|████▋ | 706/1500 [07:37<08:34, 1.54it/s, loss=0.141, lr=1] Steps: 47%|████▋ | 707/1500 [07:38<08:33, 1.55it/s, loss=0.141, lr=1] Steps: 47%|████▋ | 707/1500 [07:38<08:33, 
1.55it/s, loss=0.0497, lr=1] Steps: 47%|████▋ | 708/1500 [07:39<08:31, 1.55it/s, loss=0.0497, lr=1] Steps: 47%|████▋ | 708/1500 [07:39<08:31, 1.55it/s, loss=0.136, lr=1] Steps: 47%|████▋ | 709/1500 [07:39<08:30, 1.55it/s, loss=0.136, lr=1] Steps: 47%|████▋ | 709/1500 [07:39<08:30, 1.55it/s, loss=0.0996, lr=1] Steps: 47%|████▋ | 710/1500 [07:40<08:29, 1.55it/s, loss=0.0996, lr=1] Steps: 47%|████▋ | 710/1500 [07:40<08:29, 1.55it/s, loss=0.0778, lr=1] Steps: 47%|████▋ | 711/1500 [07:40<08:28, 1.55it/s, loss=0.0778, lr=1] Steps: 47%|████▋ | 711/1500 [07:40<08:28, 1.55it/s, loss=0.118, lr=1] Steps: 47%|████▋ | 712/1500 [07:41<08:28, 1.55it/s, loss=0.118, lr=1] Steps: 47%|████▋ | 712/1500 [07:41<08:28, 1.55it/s, loss=0.131, lr=1] Steps: 48%|████▊ | 713/1500 [07:42<08:27, 1.55it/s, loss=0.131, lr=1] Steps: 48%|████▊ | 713/1500 [07:42<08:27, 1.55it/s, loss=0.13, lr=1] Steps: 48%|████▊ | 714/1500 [07:42<08:26, 1.55it/s, loss=0.13, lr=1] Steps: 48%|████▊ | 714/1500 [07:42<08:26, 1.55it/s, loss=0.141, lr=1] Steps: 48%|████▊ | 715/1500 [07:43<08:25, 1.55it/s, loss=0.141, lr=1] Steps: 48%|████▊ | 715/1500 [07:43<08:25, 1.55it/s, loss=0.0741, lr=1] Steps: 48%|████▊ | 716/1500 [07:44<08:26, 1.55it/s, loss=0.0741, lr=1] Steps: 48%|████▊ | 716/1500 [07:44<08:26, 1.55it/s, loss=0.0596, lr=1] Steps: 48%|████▊ | 717/1500 [07:44<08:25, 1.55it/s, loss=0.0596, lr=1] Steps: 48%|████▊ | 717/1500 [07:44<08:25, 1.55it/s, loss=0.051, lr=1] Steps: 48%|████▊ | 718/1500 [07:45<08:24, 1.55it/s, loss=0.051, lr=1] Steps: 48%|████▊ | 718/1500 [07:45<08:24, 1.55it/s, loss=0.0879, lr=1] Steps: 48%|████▊ | 719/1500 [07:46<08:23, 1.55it/s, loss=0.0879, lr=1] Steps: 48%|████▊ | 719/1500 [07:46<08:23, 1.55it/s, loss=0.158, lr=1] Steps: 48%|████▊ | 720/1500 [07:46<08:23, 1.55it/s, loss=0.158, lr=1] Steps: 48%|████▊ | 720/1500 [07:46<08:23, 1.55it/s, loss=0.11, lr=1] Steps: 48%|████▊ | 721/1500 [07:47<08:25, 1.54it/s, loss=0.11, lr=1] Steps: 48%|████▊ | 721/1500 [07:47<08:25, 1.54it/s, loss=0.0909, lr=1] 
Steps: 48%|████▊ | 722/1500 [07:48<08:23, 1.55it/s, loss=0.0909, lr=1] Steps: 48%|████▊ | 722/1500 [07:48<08:23, 1.55it/s, loss=0.224, lr=1] Steps: 48%|████▊ | 723/1500 [07:48<08:21, 1.55it/s, loss=0.224, lr=1] Steps: 48%|████▊ | 723/1500 [07:48<08:21, 1.55it/s, loss=0.118, lr=1] Steps: 48%|████▊ | 724/1500 [07:49<08:20, 1.55it/s, loss=0.118, lr=1] Steps: 48%|████▊ | 724/1500 [07:49<08:20, 1.55it/s, loss=0.153, lr=1] Steps: 48%|████▊ | 725/1500 [07:49<08:20, 1.55it/s, loss=0.153, lr=1] Steps: 48%|████▊ | 725/1500 [07:49<08:20, 1.55it/s, loss=0.174, lr=1] Steps: 48%|████▊ | 726/1500 [07:50<08:19, 1.55it/s, loss=0.174, lr=1] Steps: 48%|████▊ | 726/1500 [07:50<08:19, 1.55it/s, loss=0.0981, lr=1] Steps: 48%|████▊ | 727/1500 [07:51<08:18, 1.55it/s, loss=0.0981, lr=1] Steps: 48%|████▊ | 727/1500 [07:51<08:18, 1.55it/s, loss=0.149, lr=1] Steps: 49%|████▊ | 728/1500 [07:51<08:17, 1.55it/s, loss=0.149, lr=1] Steps: 49%|████▊ | 728/1500 [07:51<08:17, 1.55it/s, loss=0.0629, lr=1] Steps: 49%|████▊ | 729/1500 [07:52<08:17, 1.55it/s, loss=0.0629, lr=1] Steps: 49%|████▊ | 729/1500 [07:52<08:17, 1.55it/s, loss=0.0951, lr=1] Steps: 49%|████▊ | 730/1500 [07:53<08:16, 1.55it/s, loss=0.0951, lr=1] Steps: 49%|████▊ | 730/1500 [07:53<08:16, 1.55it/s, loss=0.0587, lr=1] Steps: 49%|████▊ | 731/1500 [07:53<08:15, 1.55it/s, loss=0.0587, lr=1] Steps: 49%|████▊ | 731/1500 [07:53<08:15, 1.55it/s, loss=0.204, lr=1] Steps: 49%|████▉ | 732/1500 [07:54<08:15, 1.55it/s, loss=0.204, lr=1] Steps: 49%|████▉ | 732/1500 [07:54<08:15, 1.55it/s, loss=0.114, lr=1] Steps: 49%|████▉ | 733/1500 [07:55<08:15, 1.55it/s, loss=0.114, lr=1] Steps: 49%|████▉ | 733/1500 [07:55<08:15, 1.55it/s, loss=0.282, lr=1] Steps: 49%|████▉ | 734/1500 [07:55<08:14, 1.55it/s, loss=0.282, lr=1] Steps: 49%|████▉ | 734/1500 [07:55<08:14, 1.55it/s, loss=0.166, lr=1] Steps: 49%|████▉ | 735/1500 [07:56<08:13, 1.55it/s, loss=0.166, lr=1] Steps: 49%|████▉ | 735/1500 [07:56<08:13, 1.55it/s, loss=0.139, lr=1] Steps: 49%|████▉ | 736/1500 
[07:57<08:12, 1.55it/s, loss=0.139, lr=1] Steps: 49%|████▉ | 736/1500 [07:57<08:12, 1.55it/s, loss=0.101, lr=1] Steps: 49%|████▉ | 737/1500 [07:57<08:14, 1.54it/s, loss=0.101, lr=1] Steps: 49%|████▉ | 737/1500 [07:57<08:14, 1.54it/s, loss=0.103, lr=1] Steps: 49%|████▉ | 738/1500 [07:58<08:13, 1.54it/s, loss=0.103, lr=1] Steps: 49%|████▉ | 738/1500 [07:58<08:13, 1.54it/s, loss=0.216, lr=1] Steps: 49%|████▉ | 739/1500 [07:59<08:12, 1.54it/s, loss=0.216, lr=1] Steps: 49%|████▉ | 739/1500 [07:59<08:12, 1.54it/s, loss=0.166, lr=1] Steps: 49%|████▉ | 740/1500 [07:59<08:11, 1.55it/s, loss=0.166, lr=1] Steps: 49%|████▉ | 740/1500 [07:59<08:11, 1.55it/s, loss=0.0985, lr=1] Steps: 49%|████▉ | 741/1500 [08:00<08:10, 1.55it/s, loss=0.0985, lr=1] Steps: 49%|████▉ | 741/1500 [08:00<08:10, 1.55it/s, loss=0.184, lr=1] Steps: 49%|████▉ | 742/1500 [08:00<08:09, 1.55it/s, loss=0.184, lr=1] Steps: 49%|████▉ | 742/1500 [08:00<08:09, 1.55it/s, loss=0.0529, lr=1] Steps: 50%|████▉ | 743/1500 [08:01<08:08, 1.55it/s, loss=0.0529, lr=1] Steps: 50%|████▉ | 743/1500 [08:01<08:08, 1.55it/s, loss=0.0954, lr=1] Steps: 50%|████▉ | 744/1500 [08:02<08:07, 1.55it/s, loss=0.0954, lr=1] Steps: 50%|████▉ | 744/1500 [08:02<08:07, 1.55it/s, loss=0.126, lr=1] Steps: 50%|████▉ | 745/1500 [08:02<08:06, 1.55it/s, loss=0.126, lr=1] Steps: 50%|████▉ | 745/1500 [08:02<08:06, 1.55it/s, loss=0.162, lr=1] Steps: 50%|████▉ | 746/1500 [08:03<08:05, 1.55it/s, loss=0.162, lr=1] Steps: 50%|████▉ | 746/1500 [08:03<08:05, 1.55it/s, loss=0.103, lr=1] Steps: 50%|████▉ | 747/1500 [08:04<08:08, 1.54it/s, loss=0.103, lr=1] Steps: 50%|████▉ | 747/1500 [08:04<08:08, 1.54it/s, loss=0.161, lr=1] Steps: 50%|████▉ | 748/1500 [08:04<08:08, 1.54it/s, loss=0.161, lr=1] Steps: 50%|████▉ | 748/1500 [08:04<08:08, 1.54it/s, loss=0.132, lr=1] Steps: 50%|████▉ | 749/1500 [08:05<08:06, 1.54it/s, loss=0.132, lr=1] Steps: 50%|████▉ | 749/1500 [08:05<08:06, 1.54it/s, loss=0.119, lr=1] Steps: 50%|█████ | 750/1500 [08:06<08:04, 1.55it/s, 
loss=0.119, lr=1] Steps: 50%|█████ | 750/1500 [08:06<08:04, 1.55it/s, loss=0.0234, lr=1] Steps: 50%|█████ | 751/1500 [08:06<08:03, 1.55it/s, loss=0.0234, lr=1] Steps: 50%|█████ | 751/1500 [08:06<08:03, 1.55it/s, loss=0.153, lr=1] Steps: 50%|█████ | 752/1500 [08:07<08:03, 1.55it/s, loss=0.153, lr=1] Steps: 50%|█████ | 752/1500 [08:07<08:03, 1.55it/s, loss=0.039, lr=1] Steps: 50%|█████ | 753/1500 [08:08<08:05, 1.54it/s, loss=0.039, lr=1] Steps: 50%|█████ | 753/1500 [08:08<08:05, 1.54it/s, loss=0.17, lr=1] Steps: 50%|█████ | 754/1500 [08:08<08:03, 1.54it/s, loss=0.17, lr=1] Steps: 50%|█████ | 754/1500 [08:08<08:03, 1.54it/s, loss=0.228, lr=1] Steps: 50%|█████ | 755/1500 [08:09<08:02, 1.54it/s, loss=0.228, lr=1] Steps: 50%|█████ | 755/1500 [08:09<08:02, 1.54it/s, loss=0.18, lr=1] Steps: 50%|█████ | 756/1500 [08:10<08:01, 1.55it/s, loss=0.18, lr=1] Steps: 50%|█████ | 756/1500 [08:10<08:01, 1.55it/s, loss=0.0736, lr=1] Steps: 50%|█████ | 757/1500 [08:10<08:00, 1.55it/s, loss=0.0736, lr=1] Steps: 50%|█████ | 757/1500 [08:10<08:00, 1.55it/s, loss=0.139, lr=1] Steps: 51%|█████ | 758/1500 [08:11<07:59, 1.55it/s, loss=0.139, lr=1] Steps: 51%|█████ | 758/1500 [08:11<07:59, 1.55it/s, loss=0.0602, lr=1] Steps: 51%|█████ | 759/1500 [08:11<07:58, 1.55it/s, loss=0.0602, lr=1] Steps: 51%|█████ | 759/1500 [08:11<07:58, 1.55it/s, loss=0.039, lr=1] Steps: 51%|█████ | 760/1500 [08:12<07:57, 1.55it/s, loss=0.039, lr=1] Steps: 51%|█████ | 760/1500 [08:12<07:57, 1.55it/s, loss=0.136, lr=1] Steps: 51%|█████ | 761/1500 [08:13<07:56, 1.55it/s, loss=0.136, lr=1] Steps: 51%|█████ | 761/1500 [08:13<07:56, 1.55it/s, loss=0.179, lr=1] Steps: 51%|█████ | 762/1500 [08:13<07:55, 1.55it/s, loss=0.179, lr=1] Steps: 51%|█████ | 762/1500 [08:13<07:55, 1.55it/s, loss=0.137, lr=1] Steps: 51%|█████ | 763/1500 [08:14<07:55, 1.55it/s, loss=0.137, lr=1] Steps: 51%|█████ | 763/1500 [08:14<07:55, 1.55it/s, loss=0.187, lr=1] Steps: 51%|█████ | 764/1500 [08:15<07:54, 1.55it/s, loss=0.187, lr=1] Steps: 51%|█████ | 
764/1500 [08:15<07:54, 1.55it/s, loss=0.109, lr=1] Steps: 51%|█████ | 765/1500 [08:15<07:53, 1.55it/s, loss=0.109, lr=1] Steps: 51%|█████ | 765/1500 [08:15<07:53, 1.55it/s, loss=0.125, lr=1] Steps: 51%|█████ | 766/1500 [08:16<07:53, 1.55it/s, loss=0.125, lr=1] Steps: 51%|█████ | 766/1500 [08:16<07:53, 1.55it/s, loss=0.0844, lr=1] Steps: 51%|█████ | 767/1500 [08:17<07:53, 1.55it/s, loss=0.0844, lr=1] Steps: 51%|█████ | 767/1500 [08:17<07:53, 1.55it/s, loss=0.133, lr=1] Steps: 51%|█████ | 768/1500 [08:17<07:52, 1.55it/s, loss=0.133, lr=1] Steps: 51%|█████ | 768/1500 [08:17<07:52, 1.55it/s, loss=0.167, lr=1] Steps: 51%|█████▏ | 769/1500 [08:18<07:54, 1.54it/s, loss=0.167, lr=1] Steps: 51%|█████▏ | 769/1500 [08:18<07:54, 1.54it/s, loss=0.092, lr=1] Steps: 51%|█████▏ | 770/1500 [08:19<07:52, 1.54it/s, loss=0.092, lr=1] Steps: 51%|█████▏ | 770/1500 [08:19<07:52, 1.54it/s, loss=0.25, lr=1] Steps: 51%|█████▏ | 771/1500 [08:19<07:51, 1.54it/s, loss=0.25, lr=1] Steps: 51%|█████▏ | 771/1500 [08:19<07:51, 1.54it/s, loss=0.225, lr=1] Steps: 51%|█████▏ | 772/1500 [08:20<07:50, 1.55it/s, loss=0.225, lr=1] Steps: 51%|█████▏ | 772/1500 [08:20<07:50, 1.55it/s, loss=0.0841, lr=1] Steps: 52%|█████▏ | 773/1500 [08:20<07:50, 1.55it/s, loss=0.0841, lr=1] Steps: 52%|█████▏ | 773/1500 [08:20<07:50, 1.55it/s, loss=0.0806, lr=1] Steps: 52%|█████▏ | 774/1500 [08:21<07:48, 1.55it/s, loss=0.0806, lr=1] Steps: 52%|█████▏ | 774/1500 [08:21<07:48, 1.55it/s, loss=0.0689, lr=1] Steps: 52%|█████▏ | 775/1500 [08:22<07:47, 1.55it/s, loss=0.0689, lr=1] Steps: 52%|█████▏ | 775/1500 [08:22<07:47, 1.55it/s, loss=0.0753, lr=1] Steps: 52%|█████▏ | 776/1500 [08:22<07:46, 1.55it/s, loss=0.0753, lr=1] Steps: 52%|█████▏ | 776/1500 [08:22<07:46, 1.55it/s, loss=0.0451, lr=1] Steps: 52%|█████▏ | 777/1500 [08:23<07:45, 1.55it/s, loss=0.0451, lr=1] Steps: 52%|█████▏ | 777/1500 [08:23<07:45, 1.55it/s, loss=0.213, lr=1] Steps: 52%|█████▏ | 778/1500 [08:24<07:45, 1.55it/s, loss=0.213, lr=1] Steps: 52%|█████▏ | 778/1500 
[08:24<07:45, 1.55it/s, loss=0.143, lr=1] Steps: 52%|█████▏ | 779/1500 [08:24<07:45, 1.55it/s, loss=0.143, lr=1] Steps: 52%|█████▏ | 779/1500 [08:24<07:45, 1.55it/s, loss=0.222, lr=1] Steps: 52%|█████▏ | 780/1500 [08:25<07:44, 1.55it/s, loss=0.222, lr=1] Steps: 52%|█████▏ | 780/1500 [08:25<07:44, 1.55it/s, loss=0.0748, lr=1] Steps: 52%|█████▏ | 781/1500 [08:26<07:43, 1.55it/s, loss=0.0748, lr=1] Steps: 52%|█████▏ | 781/1500 [08:26<07:43, 1.55it/s, loss=0.258, lr=1] Steps: 52%|█████▏ | 782/1500 [08:26<07:42, 1.55it/s, loss=0.258, lr=1] Steps: 52%|█████▏ | 782/1500 [08:26<07:42, 1.55it/s, loss=0.127, lr=1] Steps: 52%|█████▏ | 783/1500 [08:27<07:42, 1.55it/s, loss=0.127, lr=1] Steps: 52%|█████▏ | 783/1500 [08:27<07:42, 1.55it/s, loss=0.0774, lr=1] Steps: 52%|█████▏ | 784/1500 [08:28<07:41, 1.55it/s, loss=0.0774, lr=1] Steps: 52%|█████▏ | 784/1500 [08:28<07:41, 1.55it/s, loss=0.244, lr=1] Steps: 52%|█████▏ | 785/1500 [08:28<07:44, 1.54it/s, loss=0.244, lr=1] Steps: 52%|█████▏ | 785/1500 [08:28<07:44, 1.54it/s, loss=0.0673, lr=1] Steps: 52%|█████▏ | 786/1500 [08:29<07:42, 1.54it/s, loss=0.0673, lr=1] Steps: 52%|█████▏ | 786/1500 [08:29<07:42, 1.54it/s, loss=0.165, lr=1] Steps: 52%|█████▏ | 787/1500 [08:30<07:41, 1.55it/s, loss=0.165, lr=1] Steps: 52%|█████▏ | 787/1500 [08:30<07:41, 1.55it/s, loss=0.0884, lr=1] Steps: 53%|█████▎ | 788/1500 [08:30<07:40, 1.55it/s, loss=0.0884, lr=1] Steps: 53%|█████▎ | 788/1500 [08:30<07:40, 1.55it/s, loss=0.0722, lr=1] Steps: 53%|█████▎ | 789/1500 [08:31<07:39, 1.55it/s, loss=0.0722, lr=1] Steps: 53%|█████▎ | 789/1500 [08:31<07:39, 1.55it/s, loss=0.138, lr=1] Steps: 53%|█████▎ | 790/1500 [08:31<07:38, 1.55it/s, loss=0.138, lr=1] Steps: 53%|█████▎ | 790/1500 [08:31<07:38, 1.55it/s, loss=0.159, lr=1] Steps: 53%|█████▎ | 791/1500 [08:32<07:37, 1.55it/s, loss=0.159, lr=1] Steps: 53%|█████▎ | 791/1500 [08:32<07:37, 1.55it/s, loss=0.0577, lr=1] Steps: 53%|█████▎ | 792/1500 [08:33<07:37, 1.55it/s, loss=0.0577, lr=1] Steps: 53%|█████▎ | 792/1500 
[08:33<07:37, 1.55it/s, loss=0.146, lr=1] Steps: 53%|█████▎ | 793/1500 [08:33<07:36, 1.55it/s, loss=0.146, lr=1] Steps: 53%|█████▎ | 793/1500 [08:33<07:36, 1.55it/s, loss=0.0831, lr=1] Steps: 53%|█████▎ | 794/1500 [08:34<07:35, 1.55it/s, loss=0.0831, lr=1] Steps: 53%|█████▎ | 794/1500 [08:34<07:35, 1.55it/s, loss=0.139, lr=1] Steps: 53%|█████▎ | 795/1500 [08:35<07:35, 1.55it/s, loss=0.139, lr=1] Steps: 53%|█████▎ | 795/1500 [08:35<07:35, 1.55it/s, loss=0.119, lr=1] Steps: 53%|█████▎ | 796/1500 [08:35<07:34, 1.55it/s, loss=0.119, lr=1] Steps: 53%|█████▎ | 796/1500 [08:35<07:34, 1.55it/s, loss=0.13, lr=1] Steps: 53%|█████▎ | 797/1500 [08:36<07:33, 1.55it/s, loss=0.13, lr=1] Steps: 53%|█████▎ | 797/1500 [08:36<07:33, 1.55it/s, loss=0.0476, lr=1] Steps: 53%|█████▎ | 798/1500 [08:37<07:33, 1.55it/s, loss=0.0476, lr=1] Steps: 53%|█████▎ | 798/1500 [08:37<07:33, 1.55it/s, loss=0.14, lr=1] Steps: 53%|█████▎ | 799/1500 [08:37<07:32, 1.55it/s, loss=0.14, lr=1] Steps: 53%|█████▎ | 799/1500 [08:37<07:32, 1.55it/s, loss=0.114, lr=1] Steps: 53%|█████▎ | 800/1500 [08:38<07:31, 1.55it/s, loss=0.114, lr=1] Steps: 53%|█████▎ | 800/1500 [08:38<07:31, 1.55it/s, loss=0.158, lr=1] Steps: 53%|█████▎ | 801/1500 [08:39<07:33, 1.54it/s, loss=0.158, lr=1] Steps: 53%|█████▎ | 801/1500 [08:39<07:33, 1.54it/s, loss=0.16, lr=1] Steps: 53%|█████▎ | 802/1500 [08:39<07:31, 1.54it/s, loss=0.16, lr=1] Steps: 53%|█████▎ | 802/1500 [08:39<07:31, 1.54it/s, loss=0.108, lr=1] Steps: 54%|█████▎ | 803/1500 [08:40<07:30, 1.55it/s, loss=0.108, lr=1] Steps: 54%|█████▎ | 803/1500 [08:40<07:30, 1.55it/s, loss=0.0939, lr=1] Steps: 54%|█████▎ | 804/1500 [08:41<07:29, 1.55it/s, loss=0.0939, lr=1] Steps: 54%|█████▎ | 804/1500 [08:41<07:29, 1.55it/s, loss=0.0802, lr=1] Steps: 54%|█████▎ | 805/1500 [08:41<07:27, 1.55it/s, loss=0.0802, lr=1] Steps: 54%|█████▎ | 805/1500 [08:41<07:27, 1.55it/s, loss=0.0889, lr=1] Steps: 54%|█████▎ | 806/1500 [08:42<07:27, 1.55it/s, loss=0.0889, lr=1] Steps: 54%|█████▎ | 806/1500 
[08:42<07:27, 1.55it/s, loss=0.0765, lr=1] Steps: 54%|█████▍ | 807/1500 [08:42<07:26, 1.55it/s, loss=0.0765, lr=1] Steps: 54%|█████▍ | 807/1500 [08:42<07:26, 1.55it/s, loss=0.083, lr=1] Steps: 54%|█████▍ | 808/1500 [08:43<07:25, 1.55it/s, loss=0.083, lr=1] Steps: 54%|█████▍ | 808/1500 [08:43<07:25, 1.55it/s, loss=0.151, lr=1] Steps: 54%|█████▍ | 809/1500 [08:44<07:25, 1.55it/s, loss=0.151, lr=1] Steps: 54%|█████▍ | 809/1500 [08:44<07:25, 1.55it/s, loss=0.0682, lr=1] Steps: 54%|█████▍ | 810/1500 [08:44<07:24, 1.55it/s, loss=0.0682, lr=1] Steps: 54%|█████▍ | 810/1500 [08:44<07:24, 1.55it/s, loss=0.0776, lr=1] Steps: 54%|█████▍ | 811/1500 [08:45<07:23, 1.55it/s, loss=0.0776, lr=1] Steps: 54%|█████▍ | 811/1500 [08:45<07:23, 1.55it/s, loss=0.166, lr=1] Steps: 54%|█████▍ | 812/1500 [08:46<07:23, 1.55it/s, loss=0.166, lr=1] Steps: 54%|█████▍ | 812/1500 [08:46<07:23, 1.55it/s, loss=0.142, lr=1] Steps: 54%|█████▍ | 813/1500 [08:46<07:22, 1.55it/s, loss=0.142, lr=1] Steps: 54%|█████▍ | 813/1500 [08:46<07:22, 1.55it/s, loss=0.12, lr=1] Steps: 54%|█████▍ | 814/1500 [08:47<07:22, 1.55it/s, loss=0.12, lr=1] Steps: 54%|█████▍ | 814/1500 [08:47<07:22, 1.55it/s, loss=0.119, lr=1] Steps: 54%|█████▍ | 815/1500 [08:48<07:21, 1.55it/s, loss=0.119, lr=1] Steps: 54%|█████▍ | 815/1500 [08:48<07:21, 1.55it/s, loss=0.122, lr=1] Steps: 54%|█████▍ | 816/1500 [08:48<07:20, 1.55it/s, loss=0.122, lr=1] Steps: 54%|█████▍ | 816/1500 [08:48<07:20, 1.55it/s, loss=0.23, lr=1] Steps: 54%|█████▍ | 817/1500 [08:49<07:22, 1.54it/s, loss=0.23, lr=1] Steps: 54%|█████▍ | 817/1500 [08:49<07:22, 1.54it/s, loss=0.162, lr=1] Steps: 55%|█████▍ | 818/1500 [08:50<07:20, 1.55it/s, loss=0.162, lr=1] Steps: 55%|█████▍ | 818/1500 [08:50<07:20, 1.55it/s, loss=0.159, lr=1] Steps: 55%|█████▍ | 819/1500 [08:50<07:19, 1.55it/s, loss=0.159, lr=1] Steps: 55%|█████▍ | 819/1500 [08:50<07:19, 1.55it/s, loss=0.0788, lr=1] Steps: 55%|█████▍ | 820/1500 [08:51<07:18, 1.55it/s, loss=0.0788, lr=1] Steps: 55%|█████▍ | 820/1500 
[08:51<07:18, 1.55it/s, loss=0.192, lr=1] Steps: 55%|█████▍ | 821/1500 [08:51<07:17, 1.55it/s, loss=0.192, lr=1] Steps: 55%|█████▍ | 821/1500 [08:51<07:17, 1.55it/s, loss=0.244, lr=1] Steps: 55%|█████▍ | 822/1500 [08:52<07:16, 1.55it/s, loss=0.244, lr=1] Steps: 55%|█████▍ | 822/1500 [08:52<07:16, 1.55it/s, loss=0.0713, lr=1] Steps: 55%|█████▍ | 823/1500 [08:53<07:16, 1.55it/s, loss=0.0713, lr=1] Steps: 55%|█████▍ | 823/1500 [08:53<07:16, 1.55it/s, loss=0.117, lr=1] Steps: 55%|█████▍ | 824/1500 [08:53<07:16, 1.55it/s, loss=0.117, lr=1] Steps: 55%|█████▍ | 824/1500 [08:53<07:16, 1.55it/s, loss=0.151, lr=1] Steps: 55%|█████▌ | 825/1500 [08:54<07:15, 1.55it/s, loss=0.151, lr=1] Steps: 55%|█████▌ | 825/1500 [08:54<07:15, 1.55it/s, loss=0.154, lr=1] Steps: 55%|█████▌ | 826/1500 [08:55<07:15, 1.55it/s, loss=0.154, lr=1] Steps: 55%|█████▌ | 826/1500 [08:55<07:15, 1.55it/s, loss=0.206, lr=1] Steps: 55%|█████▌ | 827/1500 [08:55<07:14, 1.55it/s, loss=0.206, lr=1] Steps: 55%|█████▌ | 827/1500 [08:55<07:14, 1.55it/s, loss=0.103, lr=1] Steps: 55%|█████▌ | 828/1500 [08:56<07:13, 1.55it/s, loss=0.103, lr=1] Steps: 55%|█████▌ | 828/1500 [08:56<07:13, 1.55it/s, loss=0.0578, lr=1] Steps: 55%|█████▌ | 829/1500 [08:57<07:13, 1.55it/s, loss=0.0578, lr=1] Steps: 55%|█████▌ | 829/1500 [08:57<07:13, 1.55it/s, loss=0.282, lr=1] Steps: 55%|█████▌ | 830/1500 [08:57<07:12, 1.55it/s, loss=0.282, lr=1] Steps: 55%|█████▌ | 830/1500 [08:57<07:12, 1.55it/s, loss=0.145, lr=1] Steps: 55%|█████▌ | 831/1500 [08:58<07:12, 1.55it/s, loss=0.145, lr=1] Steps: 55%|█████▌ | 831/1500 [08:58<07:12, 1.55it/s, loss=0.126, lr=1] Steps: 55%|█████▌ | 832/1500 [08:59<07:11, 1.55it/s, loss=0.126, lr=1] Steps: 55%|█████▌ | 832/1500 [08:59<07:11, 1.55it/s, loss=0.154, lr=1] Steps: 56%|█████▌ | 833/1500 [08:59<07:13, 1.54it/s, loss=0.154, lr=1] Steps: 56%|█████▌ | 833/1500 [08:59<07:13, 1.54it/s, loss=0.12, lr=1] Steps: 56%|█████▌ | 834/1500 [09:00<07:11, 1.54it/s, loss=0.12, lr=1] Steps: 56%|█████▌ | 834/1500 
[09:00<07:11, 1.54it/s, loss=0.11, lr=1] Steps: 56%|█████▌ | 835/1500 [09:01<07:10, 1.54it/s, loss=0.11, lr=1] Steps: 56%|█████▌ | 835/1500 [09:01<07:10, 1.54it/s, loss=0.104, lr=1] Steps: 56%|█████▌ | 836/1500 [09:01<07:09, 1.55it/s, loss=0.104, lr=1] Steps: 56%|█████▌ | 836/1500 [09:01<07:09, 1.55it/s, loss=0.151, lr=1] Steps: 56%|█████▌ | 837/1500 [09:02<07:08, 1.55it/s, loss=0.151, lr=1] Steps: 56%|█████▌ | 837/1500 [09:02<07:08, 1.55it/s, loss=0.196, lr=1] Steps: 56%|█████▌ | 838/1500 [09:02<07:07, 1.55it/s, loss=0.196, lr=1] Steps: 56%|█████▌ | 838/1500 [09:02<07:07, 1.55it/s, loss=0.0562, lr=1] Steps: 56%|█████▌ | 839/1500 [09:03<07:07, 1.55it/s, loss=0.0562, lr=1] Steps: 56%|█████▌ | 839/1500 [09:03<07:07, 1.55it/s, loss=0.137, lr=1] Steps: 56%|█████▌ | 840/1500 [09:04<07:07, 1.54it/s, loss=0.137, lr=1] Steps: 56%|█████▌ | 840/1500 [09:04<07:07, 1.54it/s, loss=0.0799, lr=1] Steps: 56%|█████▌ | 841/1500 [09:04<07:06, 1.55it/s, loss=0.0799, lr=1] Steps: 56%|█████▌ | 841/1500 [09:04<07:06, 1.55it/s, loss=0.0489, lr=1] Steps: 56%|█████▌ | 842/1500 [09:05<07:05, 1.55it/s, loss=0.0489, lr=1] Steps: 56%|█████▌ | 842/1500 [09:05<07:05, 1.55it/s, loss=0.0304, lr=1] Steps: 56%|█████▌ | 843/1500 [09:06<07:04, 1.55it/s, loss=0.0304, lr=1] Steps: 56%|█████▌ | 843/1500 [09:06<07:04, 1.55it/s, loss=0.119, lr=1] Steps: 56%|█████▋ | 844/1500 [09:06<07:04, 1.55it/s, loss=0.119, lr=1] Steps: 56%|█████▋ | 844/1500 [09:06<07:04, 1.55it/s, loss=0.102, lr=1] Steps: 56%|█████▋ | 845/1500 [09:07<07:03, 1.55it/s, loss=0.102, lr=1] Steps: 56%|█████▋ | 845/1500 [09:07<07:03, 1.55it/s, loss=0.163, lr=1] Steps: 56%|█████▋ | 846/1500 [09:08<07:02, 1.55it/s, loss=0.163, lr=1] Steps: 56%|█████▋ | 846/1500 [09:08<07:02, 1.55it/s, loss=0.111, lr=1] Steps: 56%|█████▋ | 847/1500 [09:08<07:02, 1.55it/s, loss=0.111, lr=1] Steps: 56%|█████▋ | 847/1500 [09:08<07:02, 1.55it/s, loss=0.0447, lr=1] Steps: 57%|█████▋ | 848/1500 [09:09<07:01, 1.55it/s, loss=0.0447, lr=1] Steps: 57%|█████▋ | 848/1500 
[09:09<07:01, 1.55it/s, loss=0.0748, lr=1] Steps: 57%|█████▋ | 849/1500 [09:10<07:03, 1.54it/s, loss=0.0748, lr=1] Steps: 57%|█████▋ | 849/1500 [09:10<07:03, 1.54it/s, loss=0.164, lr=1] Steps: 57%|█████▋ | 850/1500 [09:10<07:02, 1.54it/s, loss=0.164, lr=1] Steps: 57%|█████▋ | 850/1500 [09:10<07:02, 1.54it/s, loss=0.119, lr=1] Steps: 57%|█████▋ | 851/1500 [09:11<07:01, 1.54it/s, loss=0.119, lr=1] Steps: 57%|█████▋ | 851/1500 [09:11<07:01, 1.54it/s, loss=0.0891, lr=1] Steps: 57%|█████▋ | 852/1500 [09:12<06:59, 1.54it/s, loss=0.0891, lr=1] Steps: 57%|█████▋ | 852/1500 [09:12<06:59, 1.54it/s, loss=0.134, lr=1] Steps: 57%|█████▋ | 853/1500 [09:12<06:58, 1.54it/s, loss=0.134, lr=1] Steps: 57%|█████▋ | 853/1500 [09:12<06:58, 1.54it/s, loss=0.131, lr=1] Steps: 57%|█████▋ | 854/1500 [09:13<06:57, 1.55it/s, loss=0.131, lr=1] Steps: 57%|█████▋ | 854/1500 [09:13<06:57, 1.55it/s, loss=0.019, lr=1] Steps: 57%|█████▋ | 855/1500 [09:13<06:56, 1.55it/s, loss=0.019, lr=1] Steps: 57%|█████▋ | 855/1500 [09:13<06:56, 1.55it/s, loss=0.0788, lr=1] Steps: 57%|█████▋ | 856/1500 [09:14<06:55, 1.55it/s, loss=0.0788, lr=1] Steps: 57%|█████▋ | 856/1500 [09:14<06:55, 1.55it/s, loss=0.0444, lr=1] Steps: 57%|█████▋ | 857/1500 [09:15<06:54, 1.55it/s, loss=0.0444, lr=1] Steps: 57%|█████▋ | 857/1500 [09:15<06:54, 1.55it/s, loss=0.307, lr=1] Steps: 57%|█████▋ | 858/1500 [09:15<06:53, 1.55it/s, loss=0.307, lr=1] Steps: 57%|█████▋ | 858/1500 [09:15<06:53, 1.55it/s, loss=0.106, lr=1] Steps: 57%|█████▋ | 859/1500 [09:16<06:52, 1.55it/s, loss=0.106, lr=1] Steps: 57%|█████▋ | 859/1500 [09:16<06:52, 1.55it/s, loss=0.231, lr=1] Steps: 57%|█████▋ | 860/1500 [09:17<06:52, 1.55it/s, loss=0.231, lr=1] Steps: 57%|█████▋ | 860/1500 [09:17<06:52, 1.55it/s, loss=0.0339, lr=1] Steps: 57%|█████▋ | 861/1500 [09:17<06:51, 1.55it/s, loss=0.0339, lr=1] Steps: 57%|█████▋ | 861/1500 [09:17<06:51, 1.55it/s, loss=0.143, lr=1] Steps: 57%|█████▋ | 862/1500 [09:18<06:51, 1.55it/s, loss=0.143, lr=1] Steps: 57%|█████▋ | 862/1500 
[09:18<06:51, 1.55it/s, loss=0.102, lr=1] Steps: 58%|█████▊ | 863/1500 [09:19<06:51, 1.55it/s, loss=0.102, lr=1] Steps: 58%|█████▊ | 863/1500 [09:19<06:51, 1.55it/s, loss=0.0576, lr=1] Steps: 58%|█████▊ | 864/1500 [09:19<06:50, 1.55it/s, loss=0.0576, lr=1] Steps: 58%|█████▊ | 864/1500 [09:19<06:50, 1.55it/s, loss=0.209, lr=1] Steps: 58%|█████▊ | 865/1500 [09:20<06:51, 1.54it/s, loss=0.209, lr=1] Steps: 58%|█████▊ | 865/1500 [09:20<06:51, 1.54it/s, loss=0.0574, lr=1] Steps: 58%|█████▊ | 866/1500 [09:21<06:50, 1.55it/s, loss=0.0574, lr=1] Steps: 58%|█████▊ | 866/1500 [09:21<06:50, 1.55it/s, loss=0.198, lr=1] Steps: 58%|█████▊ | 867/1500 [09:21<06:48, 1.55it/s, loss=0.198, lr=1] Steps: 58%|█████▊ | 867/1500 [09:21<06:48, 1.55it/s, loss=0.0527, lr=1] Steps: 58%|█████▊ | 868/1500 [09:22<06:47, 1.55it/s, loss=0.0527, lr=1] Steps: 58%|█████▊ | 868/1500 [09:22<06:47, 1.55it/s, loss=0.0919, lr=1] Steps: 58%|█████▊ | 869/1500 [09:22<06:46, 1.55it/s, loss=0.0919, lr=1] Steps: 58%|█████▊ | 869/1500 [09:22<06:46, 1.55it/s, loss=0.137, lr=1] Steps: 58%|█████▊ | 870/1500 [09:23<06:46, 1.55it/s, loss=0.137, lr=1] Steps: 58%|█████▊ | 870/1500 [09:23<06:46, 1.55it/s, loss=0.404, lr=1] Steps: 58%|█████▊ | 871/1500 [09:24<06:45, 1.55it/s, loss=0.404, lr=1] Steps: 58%|█████▊ | 871/1500 [09:24<06:45, 1.55it/s, loss=0.0612, lr=1] Steps: 58%|█████▊ | 872/1500 [09:24<06:45, 1.55it/s, loss=0.0612, lr=1] Steps: 58%|█████▊ | 872/1500 [09:24<06:45, 1.55it/s, loss=0.104, lr=1] Steps: 58%|█████▊ | 873/1500 [09:25<06:44, 1.55it/s, loss=0.104, lr=1] Steps: 58%|█████▊ | 873/1500 [09:25<06:44, 1.55it/s, loss=0.182, lr=1] Steps: 58%|█████▊ | 874/1500 [09:26<06:44, 1.55it/s, loss=0.182, lr=1] Steps: 58%|█████▊ | 874/1500 [09:26<06:44, 1.55it/s, loss=0.103, lr=1] Steps: 58%|█████▊ | 875/1500 [09:26<06:43, 1.55it/s, loss=0.103, lr=1] Steps: 58%|█████▊ | 875/1500 [09:26<06:43, 1.55it/s, loss=0.134, lr=1] Steps: 58%|█████▊ | 876/1500 [09:27<06:42, 1.55it/s, loss=0.134, lr=1] Steps: 58%|█████▊ | 876/1500 
[09:27<06:42, 1.55it/s, loss=0.0465, lr=1] Steps: 58%|█████▊ | 877/1500 [09:28<06:42, 1.55it/s, loss=0.0465, lr=1] Steps: 58%|█████▊ | 877/1500 [09:28<06:42, 1.55it/s, loss=0.175, lr=1] Steps: 59%|█████▊ | 878/1500 [09:28<06:41, 1.55it/s, loss=0.175, lr=1] Steps: 59%|█████▊ | 878/1500 [09:28<06:41, 1.55it/s, loss=0.169, lr=1] Steps: 59%|█████▊ | 879/1500 [09:29<06:40, 1.55it/s, loss=0.169, lr=1] Steps: 59%|█████▊ | 879/1500 [09:29<06:40, 1.55it/s, loss=0.082, lr=1] Steps: 59%|█████▊ | 880/1500 [09:30<06:40, 1.55it/s, loss=0.082, lr=1] Steps: 59%|█████▊ | 880/1500 [09:30<06:40, 1.55it/s, loss=0.101, lr=1] Steps: 59%|█████▊ | 881/1500 [09:30<06:41, 1.54it/s, loss=0.101, lr=1] Steps: 59%|█████▊ | 881/1500 [09:30<06:41, 1.54it/s, loss=0.0826, lr=1] Steps: 59%|█████▉ | 882/1500 [09:31<06:41, 1.54it/s, loss=0.0826, lr=1] Steps: 59%|█████▉ | 882/1500 [09:31<06:41, 1.54it/s, loss=0.189, lr=1] Steps: 59%|█████▉ | 883/1500 [09:32<06:39, 1.54it/s, loss=0.189, lr=1] Steps: 59%|█████▉ | 883/1500 [09:32<06:39, 1.54it/s, loss=0.0935, lr=1] Steps: 59%|█████▉ | 884/1500 [09:32<06:38, 1.55it/s, loss=0.0935, lr=1] Steps: 59%|█████▉ | 884/1500 [09:32<06:38, 1.55it/s, loss=0.173, lr=1] Steps: 59%|█████▉ | 885/1500 [09:33<06:37, 1.55it/s, loss=0.173, lr=1] Steps: 59%|█████▉ | 885/1500 [09:33<06:37, 1.55it/s, loss=0.262, lr=1] Steps: 59%|█████▉ | 886/1500 [09:33<06:36, 1.55it/s, loss=0.262, lr=1] Steps: 59%|█████▉ | 886/1500 [09:33<06:36, 1.55it/s, loss=0.272, lr=1] Steps: 59%|█████▉ | 887/1500 [09:34<06:35, 1.55it/s, loss=0.272, lr=1] Steps: 59%|█████▉ | 887/1500 [09:34<06:35, 1.55it/s, loss=0.134, lr=1] Steps: 59%|█████▉ | 888/1500 [09:35<06:35, 1.55it/s, loss=0.134, lr=1] Steps: 59%|█████▉ | 888/1500 [09:35<06:35, 1.55it/s, loss=0.071, lr=1] Steps: 59%|█████▉ | 889/1500 [09:35<06:34, 1.55it/s, loss=0.071, lr=1] Steps: 59%|█████▉ | 889/1500 [09:35<06:34, 1.55it/s, loss=0.105, lr=1] Steps: 59%|█████▉ | 890/1500 [09:36<06:32, 1.55it/s, loss=0.105, lr=1] Steps: 59%|█████▉ | 890/1500 
[09:36<06:32, 1.55it/s, loss=0.0701, lr=1] Steps: 59%|█████▉ | 891/1500 [09:37<06:33, 1.55it/s, loss=0.0701, lr=1] Steps: 59%|█████▉ | 891/1500 [09:37<06:33, 1.55it/s, loss=0.0797, lr=1] Steps: 59%|█████▉ | 892/1500 [09:37<06:32, 1.55it/s, loss=0.0797, lr=1] Steps: 59%|█████▉ | 892/1500 [09:37<06:32, 1.55it/s, loss=0.132, lr=1] Steps: 60%|█████▉ | 893/1500 [09:38<06:31, 1.55it/s, loss=0.132, lr=1] Steps: 60%|█████▉ | 893/1500 [09:38<06:31, 1.55it/s, loss=0.0866, lr=1] Steps: 60%|█████▉ | 894/1500 [09:39<06:30, 1.55it/s, loss=0.0866, lr=1] Steps: 60%|█████▉ | 894/1500 [09:39<06:30, 1.55it/s, loss=0.0557, lr=1] Steps: 60%|█████▉ | 895/1500 [09:39<06:30, 1.55it/s, loss=0.0557, lr=1] Steps: 60%|█████▉ | 895/1500 [09:39<06:30, 1.55it/s, loss=0.143, lr=1] Steps: 60%|█████▉ | 896/1500 [09:40<06:29, 1.55it/s, loss=0.143, lr=1] Steps: 60%|█████▉ | 896/1500 [09:40<06:29, 1.55it/s, loss=0.104, lr=1] Steps: 60%|█████▉ | 897/1500 [09:41<06:30, 1.54it/s, loss=0.104, lr=1] Steps: 60%|█████▉ | 897/1500 [09:41<06:30, 1.54it/s, loss=0.0695, lr=1] Steps: 60%|█████▉ | 898/1500 [09:41<06:29, 1.55it/s, loss=0.0695, lr=1] Steps: 60%|█████▉ | 898/1500 [09:41<06:29, 1.55it/s, loss=0.134, lr=1] Steps: 60%|█████▉ | 899/1500 [09:42<06:28, 1.55it/s, loss=0.134, lr=1] Steps: 60%|█████▉ | 899/1500 [09:42<06:28, 1.55it/s, loss=0.0307, lr=1] Steps: 60%|██████ | 900/1500 [09:43<06:28, 1.55it/s, loss=0.0307, lr=1] Steps: 60%|██████ | 900/1500 [09:43<06:28, 1.55it/s, loss=0.0581, lr=1] Steps: 60%|██████ | 901/1500 [09:43<06:27, 1.55it/s, loss=0.0581, lr=1] Steps: 60%|██████ | 901/1500 [09:43<06:27, 1.55it/s, loss=0.193, lr=1] Steps: 60%|██████ | 902/1500 [09:44<06:27, 1.54it/s, loss=0.193, lr=1] Steps: 60%|██████ | 902/1500 [09:44<06:27, 1.54it/s, loss=0.101, lr=1] Steps: 60%|██████ | 903/1500 [09:44<06:26, 1.55it/s, loss=0.101, lr=1] Steps: 60%|██████ | 903/1500 [09:44<06:26, 1.55it/s, loss=0.152, lr=1] Steps: 60%|██████ | 904/1500 [09:45<06:25, 1.55it/s, loss=0.152, lr=1] Steps: 60%|██████ | 
904/1500 [09:45<06:25, 1.55it/s, loss=0.0389, lr=1] Steps: 60%|██████ | 905/1500 [09:46<06:24, 1.55it/s, loss=0.0389, lr=1] Steps: 60%|██████ | 905/1500 [09:46<06:24, 1.55it/s, loss=0.17, lr=1] Steps: 60%|██████ | 906/1500 [09:46<06:23, 1.55it/s, loss=0.17, lr=1] Steps: 60%|██████ | 906/1500 [09:46<06:23, 1.55it/s, loss=0.135, lr=1] Steps: 60%|██████ | 907/1500 [09:47<06:22, 1.55it/s, loss=0.135, lr=1] Steps: 60%|██████ | 907/1500 [09:47<06:22, 1.55it/s, loss=0.0728, lr=1] Steps: 61%|██████ | 908/1500 [09:48<06:21, 1.55it/s, loss=0.0728, lr=1] Steps: 61%|██████ | 908/1500 [09:48<06:21, 1.55it/s, loss=0.146, lr=1] Steps: 61%|██████ | 909/1500 [09:48<06:21, 1.55it/s, loss=0.146, lr=1] Steps: 61%|██████ | 909/1500 [09:48<06:21, 1.55it/s, loss=0.113, lr=1] Steps: 61%|██████ | 910/1500 [09:49<06:20, 1.55it/s, loss=0.113, lr=1] Steps: 61%|██████ | 910/1500 [09:49<06:20, 1.55it/s, loss=0.0995, lr=1] Steps: 61%|██████ | 911/1500 [09:50<06:20, 1.55it/s, loss=0.0995, lr=1] Steps: 61%|██████ | 911/1500 [09:50<06:20, 1.55it/s, loss=0.115, lr=1] Steps: 61%|██████ | 912/1500 [09:50<06:19, 1.55it/s, loss=0.115, lr=1] Steps: 61%|██████ | 912/1500 [09:50<06:19, 1.55it/s, loss=0.0825, lr=1] Steps: 61%|██████ | 913/1500 [09:51<06:21, 1.54it/s, loss=0.0825, lr=1] Steps: 61%|██████ | 913/1500 [09:51<06:21, 1.54it/s, loss=0.102, lr=1] Steps: 61%|██████ | 914/1500 [09:52<06:20, 1.54it/s, loss=0.102, lr=1] Steps: 61%|██████ | 914/1500 [09:52<06:20, 1.54it/s, loss=0.182, lr=1] Steps: 61%|██████ | 915/1500 [09:52<06:19, 1.54it/s, loss=0.182, lr=1] Steps: 61%|██████ | 915/1500 [09:52<06:19, 1.54it/s, loss=0.0937, lr=1] Steps: 61%|██████ | 916/1500 [09:53<06:18, 1.54it/s, loss=0.0937, lr=1] Steps: 61%|██████ | 916/1500 [09:53<06:18, 1.54it/s, loss=0.159, lr=1] Steps: 61%|██████ | 917/1500 [09:54<06:17, 1.54it/s, loss=0.159, lr=1] Steps: 61%|██████ | 917/1500 [09:54<06:17, 1.54it/s, loss=0.0698, lr=1] Steps: 61%|██████ | 918/1500 [09:54<06:17, 1.54it/s, loss=0.0698, lr=1] Steps: 61%|██████ | 
918/1500 [09:54<06:17, 1.54it/s, loss=0.195, lr=1] Steps: 61%|██████▏ | 919/1500 [09:55<06:16, 1.54it/s, loss=0.195, lr=1] Steps: 61%|██████▏ | 919/1500 [09:55<06:16, 1.54it/s, loss=0.0995, lr=1] Steps: 61%|██████▏ | 920/1500 [09:55<06:15, 1.54it/s, loss=0.0995, lr=1] Steps: 61%|██████▏ | 920/1500 [09:55<06:15, 1.54it/s, loss=0.171, lr=1] Steps: 61%|██████▏ | 921/1500 [09:56<06:14, 1.54it/s, loss=0.171, lr=1] Steps: 61%|██████▏ | 921/1500 [09:56<06:14, 1.54it/s, loss=0.171, lr=1] Steps: 61%|██████▏ | 922/1500 [09:57<06:14, 1.54it/s, loss=0.171, lr=1] Steps: 61%|██████▏ | 922/1500 [09:57<06:14, 1.54it/s, loss=0.147, lr=1] Steps: 62%|██████▏ | 923/1500 [09:57<06:13, 1.55it/s, loss=0.147, lr=1] Steps: 62%|██████▏ | 923/1500 [09:57<06:13, 1.55it/s, loss=0.234, lr=1] Steps: 62%|██████▏ | 924/1500 [09:58<06:12, 1.55it/s, loss=0.234, lr=1] Steps: 62%|██████▏ | 924/1500 [09:58<06:12, 1.55it/s, loss=0.106, lr=1] Steps: 62%|██████▏ | 925/1500 [09:59<06:11, 1.55it/s, loss=0.106, lr=1] Steps: 62%|██████▏ | 925/1500 [09:59<06:11, 1.55it/s, loss=0.122, lr=1] Steps: 62%|██████▏ | 926/1500 [09:59<06:10, 1.55it/s, loss=0.122, lr=1] Steps: 62%|██████▏ | 926/1500 [09:59<06:10, 1.55it/s, loss=0.134, lr=1] Steps: 62%|██████▏ | 927/1500 [10:00<06:10, 1.55it/s, loss=0.134, lr=1] Steps: 62%|██████▏ | 927/1500 [10:00<06:10, 1.55it/s, loss=0.126, lr=1] Steps: 62%|██████▏ | 928/1500 [10:01<06:09, 1.55it/s, loss=0.126, lr=1] Steps: 62%|██████▏ | 928/1500 [10:01<06:09, 1.55it/s, loss=0.0785, lr=1] Steps: 62%|██████▏ | 929/1500 [10:01<06:11, 1.54it/s, loss=0.0785, lr=1] Steps: 62%|██████▏ | 929/1500 [10:01<06:11, 1.54it/s, loss=0.142, lr=1] Steps: 62%|██████▏ | 930/1500 [10:02<06:10, 1.54it/s, loss=0.142, lr=1] Steps: 62%|██████▏ | 930/1500 [10:02<06:10, 1.54it/s, loss=0.149, lr=1] Steps: 62%|██████▏ | 931/1500 [10:03<06:08, 1.54it/s, loss=0.149, lr=1] Steps: 62%|██████▏ | 931/1500 [10:03<06:08, 1.54it/s, loss=0.157, lr=1] Steps: 62%|██████▏ | 932/1500 [10:03<06:08, 1.54it/s, loss=0.157, lr=1] 
Steps: 62%|██████▏ | 932/1500 [10:03<06:08, 1.54it/s, loss=0.173, lr=1] Steps: 62%|██████▏ | 933/1500 [10:04<06:07, 1.54it/s, loss=0.173, lr=1] Steps: 62%|██████▏ | 933/1500 [10:04<06:07, 1.54it/s, loss=0.0993, lr=1] Steps: 62%|██████▏ | 934/1500 [10:05<06:06, 1.55it/s, loss=0.0993, lr=1] Steps: 62%|██████▏ | 934/1500 [10:05<06:06, 1.55it/s, loss=0.106, lr=1] Steps: 62%|██████▏ | 935/1500 [10:05<06:05, 1.55it/s, loss=0.106, lr=1] Steps: 62%|██████▏ | 935/1500 [10:05<06:05, 1.55it/s, loss=0.224, lr=1] Steps: 62%|██████▏ | 936/1500 [10:06<06:05, 1.54it/s, loss=0.224, lr=1] Steps: 62%|██████▏ | 936/1500 [10:06<06:05, 1.54it/s, loss=0.0994, lr=1] Steps: 62%|██████▏ | 937/1500 [10:07<06:26, 1.46it/s, loss=0.0994, lr=1] Steps: 62%|██████▏ | 937/1500 [10:07<06:26, 1.46it/s, loss=0.118, lr=1] Steps: 63%|██████▎ | 938/1500 [10:07<06:19, 1.48it/s, loss=0.118, lr=1] Steps: 63%|██████▎ | 938/1500 [10:07<06:19, 1.48it/s, loss=0.133, lr=1] Steps: 63%|██████▎ | 939/1500 [10:08<06:14, 1.50it/s, loss=0.133, lr=1] Steps: 63%|██████▎ | 939/1500 [10:08<06:14, 1.50it/s, loss=0.308, lr=1] Steps: 63%|██████▎ | 940/1500 [10:09<06:09, 1.52it/s, loss=0.308, lr=1] Steps: 63%|██████▎ | 940/1500 [10:09<06:09, 1.52it/s, loss=0.133, lr=1] Steps: 63%|██████▎ | 941/1500 [10:09<06:06, 1.53it/s, loss=0.133, lr=1] Steps: 63%|██████▎ | 941/1500 [10:09<06:06, 1.53it/s, loss=0.132, lr=1] Steps: 63%|██████▎ | 942/1500 [10:10<06:04, 1.53it/s, loss=0.132, lr=1] Steps: 63%|██████▎ | 942/1500 [10:10<06:04, 1.53it/s, loss=0.0924, lr=1] Steps: 63%|██████▎ | 943/1500 [10:10<06:02, 1.54it/s, loss=0.0924, lr=1] Steps: 63%|██████▎ | 943/1500 [10:10<06:02, 1.54it/s, loss=0.0814, lr=1] Steps: 63%|██████▎ | 944/1500 [10:11<06:00, 1.54it/s, loss=0.0814, lr=1] Steps: 63%|██████▎ | 944/1500 [10:11<06:00, 1.54it/s, loss=0.0831, lr=1] Steps: 63%|██████▎ | 945/1500 [10:12<06:02, 1.53it/s, loss=0.0831, lr=1] Steps: 63%|██████▎ | 945/1500 [10:12<06:02, 1.53it/s, loss=0.0933, lr=1] Steps: 63%|██████▎ | 946/1500 [10:12<06:00, 
1.54it/s, loss=0.0933, lr=1] Steps: 63%|██████▎ | 946/1500 [10:12<06:00, 1.54it/s, loss=0.139, lr=1] Steps: 63%|██████▎ | 947/1500 [10:13<05:58, 1.54it/s, loss=0.139, lr=1] Steps: 63%|██████▎ | 947/1500 [10:13<05:58, 1.54it/s, loss=0.0964, lr=1] Steps: 63%|██████▎ | 948/1500 [10:14<05:57, 1.54it/s, loss=0.0964, lr=1] Steps: 63%|██████▎ | 948/1500 [10:14<05:57, 1.54it/s, loss=0.0664, lr=1] Steps: 63%|██████▎ | 949/1500 [10:14<05:56, 1.54it/s, loss=0.0664, lr=1] Steps: 63%|██████▎ | 949/1500 [10:14<05:56, 1.54it/s, loss=0.108, lr=1] Steps: 63%|██████▎ | 950/1500 [10:15<05:55, 1.55it/s, loss=0.108, lr=1] Steps: 63%|██████▎ | 950/1500 [10:15<05:55, 1.55it/s, loss=0.0834, lr=1] Steps: 63%|██████▎ | 951/1500 [10:16<05:54, 1.55it/s, loss=0.0834, lr=1] Steps: 63%|██████▎ | 951/1500 [10:16<05:54, 1.55it/s, loss=0.109, lr=1] Steps: 63%|██████▎ | 952/1500 [10:16<05:54, 1.54it/s, loss=0.109, lr=1] Steps: 63%|██████▎ | 952/1500 [10:16<05:54, 1.54it/s, loss=0.218, lr=1] Steps: 64%|██████▎ | 953/1500 [10:17<05:53, 1.55it/s, loss=0.218, lr=1] Steps: 64%|██████▎ | 953/1500 [10:17<05:53, 1.55it/s, loss=0.0897, lr=1] Steps: 64%|██████▎ | 954/1500 [10:18<05:52, 1.55it/s, loss=0.0897, lr=1] Steps: 64%|██████▎ | 954/1500 [10:18<05:52, 1.55it/s, loss=0.0709, lr=1] Steps: 64%|██████▎ | 955/1500 [10:18<05:52, 1.55it/s, loss=0.0709, lr=1] Steps: 64%|██████▎ | 955/1500 [10:18<05:52, 1.55it/s, loss=0.223, lr=1] Steps: 64%|██████▎ | 956/1500 [10:19<05:51, 1.55it/s, loss=0.223, lr=1] Steps: 64%|██████▎ | 956/1500 [10:19<05:51, 1.55it/s, loss=0.14, lr=1] Steps: 64%|██████▍ | 957/1500 [10:20<05:50, 1.55it/s, loss=0.14, lr=1] Steps: 64%|██████▍ | 957/1500 [10:20<05:50, 1.55it/s, loss=0.14, lr=1] Steps: 64%|██████▍ | 958/1500 [10:20<05:49, 1.55it/s, loss=0.14, lr=1] Steps: 64%|██████▍ | 958/1500 [10:20<05:49, 1.55it/s, loss=0.118, lr=1] Steps: 64%|██████▍ | 959/1500 [10:21<05:49, 1.55it/s, loss=0.118, lr=1] Steps: 64%|██████▍ | 959/1500 [10:21<05:49, 1.55it/s, loss=0.136, lr=1] Steps: 64%|██████▍ | 
960/1500 [10:21<05:48, 1.55it/s, loss=0.136, lr=1] Steps: 64%|██████▍ | 960/1500 [10:21<05:48, 1.55it/s, loss=0.0722, lr=1] Steps: 64%|██████▍ | 961/1500 [10:22<05:49, 1.54it/s, loss=0.0722, lr=1] Steps: 64%|██████▍ | 961/1500 [10:22<05:49, 1.54it/s, loss=0.0631, lr=1] Steps: 64%|██████▍ | 962/1500 [10:23<05:48, 1.54it/s, loss=0.0631, lr=1] Steps: 64%|██████▍ | 962/1500 [10:23<05:48, 1.54it/s, loss=0.118, lr=1] Steps: 64%|██████▍ | 963/1500 [10:23<05:47, 1.55it/s, loss=0.118, lr=1] Steps: 64%|██████▍ | 963/1500 [10:23<05:47, 1.55it/s, loss=0.0361, lr=1] Steps: 64%|██████▍ | 964/1500 [10:24<05:46, 1.55it/s, loss=0.0361, lr=1] Steps: 64%|██████▍ | 964/1500 [10:24<05:46, 1.55it/s, loss=0.137, lr=1] Steps: 64%|██████▍ | 965/1500 [10:25<05:45, 1.55it/s, loss=0.137, lr=1] Steps: 64%|██████▍ | 965/1500 [10:25<05:45, 1.55it/s, loss=0.116, lr=1] Steps: 64%|██████▍ | 966/1500 [10:25<05:44, 1.55it/s, loss=0.116, lr=1] Steps: 64%|██████▍ | 966/1500 [10:25<05:44, 1.55it/s, loss=0.124, lr=1] Steps: 64%|██████▍ | 967/1500 [10:26<05:43, 1.55it/s, loss=0.124, lr=1] Steps: 64%|██████▍ | 967/1500 [10:26<05:43, 1.55it/s, loss=0.0765, lr=1] Steps: 65%|██████▍ | 968/1500 [10:27<05:43, 1.55it/s, loss=0.0765, lr=1] Steps: 65%|██████▍ | 968/1500 [10:27<05:43, 1.55it/s, loss=0.117, lr=1] Steps: 65%|██████▍ | 969/1500 [10:27<05:42, 1.55it/s, loss=0.117, lr=1] Steps: 65%|██████▍ | 969/1500 [10:27<05:42, 1.55it/s, loss=0.152, lr=1] Steps: 65%|██████▍ | 970/1500 [10:28<05:41, 1.55it/s, loss=0.152, lr=1] Steps: 65%|██████▍ | 970/1500 [10:28<05:41, 1.55it/s, loss=0.0291, lr=1] Steps: 65%|██████▍ | 971/1500 [10:29<05:41, 1.55it/s, loss=0.0291, lr=1] Steps: 65%|██████▍ | 971/1500 [10:29<05:41, 1.55it/s, loss=0.268, lr=1] Steps: 65%|██████▍ | 972/1500 [10:29<05:41, 1.55it/s, loss=0.268, lr=1] Steps: 65%|██████▍ | 972/1500 [10:29<05:41, 1.55it/s, loss=0.134, lr=1] Steps: 65%|██████▍ | 973/1500 [10:30<05:40, 1.55it/s, loss=0.134, lr=1] Steps: 65%|██████▍ | 973/1500 [10:30<05:40, 1.55it/s, loss=0.145, 
lr=1] Steps: 65%|██████▍ | 974/1500 [10:31<05:39, 1.55it/s, loss=0.145, lr=1] Steps: 65%|██████▍ | 974/1500 [10:31<05:39, 1.55it/s, loss=0.071, lr=1] Steps: 65%|██████▌ | 975/1500 [10:31<05:38, 1.55it/s, loss=0.071, lr=1] Steps: 65%|██████▌ | 975/1500 [10:31<05:38, 1.55it/s, loss=0.0721, lr=1] Steps: 65%|██████▌ | 976/1500 [10:32<05:38, 1.55it/s, loss=0.0721, lr=1] Steps: 65%|██████▌ | 976/1500 [10:32<05:38, 1.55it/s, loss=0.0855, lr=1] Steps: 65%|██████▌ | 977/1500 [10:32<05:39, 1.54it/s, loss=0.0855, lr=1] Steps: 65%|██████▌ | 977/1500 [10:32<05:39, 1.54it/s, loss=0.0741, lr=1] Steps: 65%|██████▌ | 978/1500 [10:33<05:37, 1.55it/s, loss=0.0741, lr=1] Steps: 65%|██████▌ | 978/1500 [10:33<05:37, 1.55it/s, loss=0.109, lr=1] Steps: 65%|██████▌ | 979/1500 [10:34<05:36, 1.55it/s, loss=0.109, lr=1] Steps: 65%|██████▌ | 979/1500 [10:34<05:36, 1.55it/s, loss=0.0505, lr=1] Steps: 65%|██████▌ | 980/1500 [10:34<05:36, 1.55it/s, loss=0.0505, lr=1] Steps: 65%|██████▌ | 980/1500 [10:34<05:36, 1.55it/s, loss=0.0594, lr=1] Steps: 65%|██████▌ | 981/1500 [10:35<05:35, 1.55it/s, loss=0.0594, lr=1] Steps: 65%|██████▌ | 981/1500 [10:35<05:35, 1.55it/s, loss=0.21, lr=1] Steps: 65%|██████▌ | 982/1500 [10:36<05:34, 1.55it/s, loss=0.21, lr=1] Steps: 65%|██████▌ | 982/1500 [10:36<05:34, 1.55it/s, loss=0.184, lr=1] Steps: 66%|██████▌ | 983/1500 [10:36<05:33, 1.55it/s, loss=0.184, lr=1] Steps: 66%|██████▌ | 983/1500 [10:36<05:33, 1.55it/s, loss=0.152, lr=1] Steps: 66%|██████▌ | 984/1500 [10:37<05:32, 1.55it/s, loss=0.152, lr=1] Steps: 66%|██████▌ | 984/1500 [10:37<05:32, 1.55it/s, loss=0.0604, lr=1] Steps: 66%|██████▌ | 985/1500 [10:38<06:15, 1.37it/s, loss=0.0604, lr=1] Steps: 66%|██████▌ | 985/1500 [10:38<06:15, 1.37it/s, loss=0.116, lr=1] Steps: 66%|██████▌ | 986/1500 [10:39<06:01, 1.42it/s, loss=0.116, lr=1] Steps: 66%|██████▌ | 986/1500 [10:39<06:01, 1.42it/s, loss=0.169, lr=1] Steps: 66%|██████▌ | 987/1500 [10:39<05:51, 1.46it/s, loss=0.169, lr=1] Steps: 66%|██████▌ | 987/1500 
[10:39<05:51, 1.46it/s, loss=0.129, lr=1] Steps: 66%|██████▌ | 988/1500 [10:40<05:44, 1.49it/s, loss=0.129, lr=1] Steps: 66%|██████▌ | 988/1500 [10:40<05:44, 1.49it/s, loss=0.106, lr=1] Steps: 66%|██████▌ | 989/1500 [10:40<05:39, 1.51it/s, loss=0.106, lr=1] Steps: 66%|██████▌ | 989/1500 [10:40<05:39, 1.51it/s, loss=0.117, lr=1] Steps: 66%|██████▌ | 990/1500 [10:41<05:35, 1.52it/s, loss=0.117, lr=1] Steps: 66%|██████▌ | 990/1500 [10:41<05:35, 1.52it/s, loss=0.176, lr=1] Steps: 66%|██████▌ | 991/1500 [10:42<05:33, 1.53it/s, loss=0.176, lr=1] Steps: 66%|██████▌ | 991/1500 [10:42<05:33, 1.53it/s, loss=0.0918, lr=1] Steps: 66%|██████▌ | 992/1500 [10:42<05:30, 1.53it/s, loss=0.0918, lr=1] Steps: 66%|██████▌ | 992/1500 [10:42<05:30, 1.53it/s, loss=0.0964, lr=1] Steps: 66%|██████▌ | 993/1500 [10:43<05:31, 1.53it/s, loss=0.0964, lr=1] Steps: 66%|██████▌ | 993/1500 [10:43<05:31, 1.53it/s, loss=0.116, lr=1] Steps: 66%|██████▋ | 994/1500 [10:44<05:31, 1.53it/s, loss=0.116, lr=1] Steps: 66%|██████▋ | 994/1500 [10:44<05:31, 1.53it/s, loss=0.0747, lr=1] Steps: 66%|██████▋ | 995/1500 [10:44<05:29, 1.53it/s, loss=0.0747, lr=1] Steps: 66%|██████▋ | 995/1500 [10:44<05:29, 1.53it/s, loss=0.207, lr=1] Steps: 66%|██████▋ | 996/1500 [10:45<05:27, 1.54it/s, loss=0.207, lr=1] Steps: 66%|██████▋ | 996/1500 [10:45<05:27, 1.54it/s, loss=0.0351, lr=1] Steps: 66%|██████▋ | 997/1500 [10:46<05:26, 1.54it/s, loss=0.0351, lr=1] Steps: 66%|██████▋ | 997/1500 [10:46<05:26, 1.54it/s, loss=0.0997, lr=1] Steps: 67%|██████▋ | 998/1500 [10:46<05:25, 1.54it/s, loss=0.0997, lr=1] Steps: 67%|██████▋ | 998/1500 [10:46<05:25, 1.54it/s, loss=0.201, lr=1] Steps: 67%|██████▋ | 999/1500 [10:47<05:24, 1.54it/s, loss=0.201, lr=1] Steps: 67%|██████▋ | 999/1500 [10:47<05:24, 1.54it/s, loss=0.289, lr=1] Steps: 67%|██████▋ | 1000/1500 [10:48<05:23, 1.55it/s, loss=0.289, lr=1] Steps: 67%|██████▋ | 1000/1500 [10:48<05:23, 1.55it/s, loss=0.0773, lr=1] Steps: 67%|██████▋ | 1001/1500 [10:48<05:23, 1.54it/s, loss=0.0773, 
lr=1] Steps: 67%|██████▋ | 1001/1500 [10:48<05:23, 1.54it/s, loss=0.0843, lr=1] Steps: 67%|██████▋ | 1002/1500 [10:49<05:22, 1.54it/s, loss=0.0843, lr=1] Steps: 67%|██████▋ | 1002/1500 [10:49<05:22, 1.54it/s, loss=0.231, lr=1] Steps: 67%|██████▋ | 1003/1500 [10:50<05:21, 1.55it/s, loss=0.231, lr=1] Steps: 67%|██████▋ | 1003/1500 [10:50<05:21, 1.55it/s, loss=0.0999, lr=1] Steps: 67%|██████▋ | 1004/1500 [10:50<05:20, 1.55it/s, loss=0.0999, lr=1] Steps: 67%|██████▋ | 1004/1500 [10:50<05:20, 1.55it/s, loss=0.193, lr=1] Steps: 67%|██████▋ | 1005/1500 [10:51<05:19, 1.55it/s, loss=0.193, lr=1] Steps: 67%|██████▋ | 1005/1500 [10:51<05:19, 1.55it/s, loss=0.0955, lr=1] Steps: 67%|██████▋ | 1006/1500 [10:51<05:18, 1.55it/s, loss=0.0955, lr=1] Steps: 67%|██████▋ | 1006/1500 [10:51<05:18, 1.55it/s, loss=0.0954, lr=1] Steps: 67%|██████▋ | 1007/1500 [10:52<05:18, 1.55it/s, loss=0.0954, lr=1] Steps: 67%|██████▋ | 1007/1500 [10:52<05:18, 1.55it/s, loss=0.273, lr=1] Steps: 67%|██████▋ | 1008/1500 [10:53<05:17, 1.55it/s, loss=0.273, lr=1] Steps: 67%|██████▋ | 1008/1500 [10:53<05:17, 1.55it/s, loss=0.136, lr=1] Steps: 67%|██████▋ | 1009/1500 [10:53<05:18, 1.54it/s, loss=0.136, lr=1] Steps: 67%|██████▋ | 1009/1500 [10:53<05:18, 1.54it/s, loss=0.0223, lr=1] Steps: 67%|██████▋ | 1010/1500 [10:54<05:17, 1.54it/s, loss=0.0223, lr=1] Steps: 67%|██████▋ | 1010/1500 [10:54<05:17, 1.54it/s, loss=0.085, lr=1] Steps: 67%|██████▋ | 1011/1500 [10:55<05:17, 1.54it/s, loss=0.085, lr=1] Steps: 67%|██████▋ | 1011/1500 [10:55<05:17, 1.54it/s, loss=0.0704, lr=1] Steps: 67%|██████▋ | 1012/1500 [10:55<05:16, 1.54it/s, loss=0.0704, lr=1] Steps: 67%|██████▋ | 1012/1500 [10:55<05:16, 1.54it/s, loss=0.0788, lr=1] Steps: 68%|██████▊ | 1013/1500 [10:56<05:15, 1.54it/s, loss=0.0788, lr=1] Steps: 68%|██████▊ | 1013/1500 [10:56<05:15, 1.54it/s, loss=0.279, lr=1] Steps: 68%|██████▊ | 1014/1500 [10:57<05:14, 1.54it/s, loss=0.279, lr=1] Steps: 68%|██████▊ | 1014/1500 [10:57<05:14, 1.54it/s, loss=0.0669, lr=1] Steps: 
68%|██████▊ | 1015/1500 [10:57<05:14, 1.54it/s, loss=0.0669, lr=1] Steps: 68%|██████▊ | 1015/1500 [10:57<05:14, 1.54it/s, loss=0.122, lr=1] Steps: 68%|██████▊ | 1016/1500 [10:58<05:13, 1.54it/s, loss=0.122, lr=1] Steps: 68%|██████▊ | 1016/1500 [10:58<05:13, 1.54it/s, loss=0.0945, lr=1] Steps: 68%|██████▊ | 1017/1500 [10:59<05:12, 1.55it/s, loss=0.0945, lr=1] Steps: 68%|██████▊ | 1017/1500 [10:59<05:12, 1.55it/s, loss=0.121, lr=1] Steps: 68%|██████▊ | 1018/1500 [10:59<05:11, 1.54it/s, loss=0.121, lr=1] Steps: 68%|██████▊ | 1018/1500 [10:59<05:11, 1.54it/s, loss=0.121, lr=1] Steps: 68%|██████▊ | 1019/1500 [11:00<05:11, 1.54it/s, loss=0.121, lr=1] Steps: 68%|██████▊ | 1019/1500 [11:00<05:11, 1.54it/s, loss=0.232, lr=1] Steps: 68%|██████▊ | 1020/1500 [11:01<05:10, 1.54it/s, loss=0.232, lr=1] Steps: 68%|██████▊ | 1020/1500 [11:01<05:10, 1.54it/s, loss=0.085, lr=1] Steps: 68%|██████▊ | 1021/1500 [11:01<05:10, 1.54it/s, loss=0.085, lr=1] Steps: 68%|██████▊ | 1021/1500 [11:01<05:10, 1.54it/s, loss=0.0912, lr=1] Steps: 68%|██████▊ | 1022/1500 [11:02<05:10, 1.54it/s, loss=0.0912, lr=1] Steps: 68%|██████▊ | 1022/1500 [11:02<05:10, 1.54it/s, loss=0.0651, lr=1] Steps: 68%|██████▊ | 1023/1500 [11:02<05:09, 1.54it/s, loss=0.0651, lr=1] Steps: 68%|██████▊ | 1023/1500 [11:02<05:09, 1.54it/s, loss=0.0371, lr=1] Steps: 68%|██████▊ | 1024/1500 [11:03<05:08, 1.54it/s, loss=0.0371, lr=1] Steps: 68%|██████▊ | 1024/1500 [11:03<05:08, 1.54it/s, loss=0.117, lr=1] Steps: 68%|██████▊ | 1025/1500 [11:04<05:10, 1.53it/s, loss=0.117, lr=1] Steps: 68%|██████▊ | 1025/1500 [11:04<05:10, 1.53it/s, loss=0.107, lr=1] Steps: 68%|██████▊ | 1026/1500 [11:04<05:09, 1.53it/s, loss=0.107, lr=1] Steps: 68%|██████▊ | 1026/1500 [11:04<05:09, 1.53it/s, loss=0.121, lr=1] Steps: 68%|██████▊ | 1027/1500 [11:05<05:07, 1.54it/s, loss=0.121, lr=1] Steps: 68%|██████▊ | 1027/1500 [11:05<05:07, 1.54it/s, loss=0.152, lr=1] Steps: 69%|██████▊ | 1028/1500 [11:06<05:06, 1.54it/s, loss=0.152, lr=1] Steps: 69%|██████▊ | 
1028/1500 [11:06<05:06, 1.54it/s, loss=0.248, lr=1] Steps: 69%|██████▊ | 1029/1500 [11:06<05:05, 1.54it/s, loss=0.248, lr=1] Steps: 69%|██████▊ | 1029/1500 [11:06<05:05, 1.54it/s, loss=0.0565, lr=1] Steps: 69%|██████▊ | 1030/1500 [11:07<05:04, 1.54it/s, loss=0.0565, lr=1] Steps: 69%|██████▊ | 1030/1500 [11:07<05:04, 1.54it/s, loss=0.0352, lr=1] Steps: 69%|██████▊ | 1031/1500 [11:08<05:03, 1.55it/s, loss=0.0352, lr=1] Steps: 69%|██████▊ | 1031/1500 [11:08<05:03, 1.55it/s, loss=0.0997, lr=1] Steps: 69%|██████▉ | 1032/1500 [11:08<05:02, 1.55it/s, loss=0.0997, lr=1] Steps: 69%|██████▉ | 1032/1500 [11:08<05:02, 1.55it/s, loss=0.115, lr=1] Steps: 69%|██████▉ | 1033/1500 [11:09<05:01, 1.55it/s, loss=0.115, lr=1] Steps: 69%|██████▉ | 1033/1500 [11:09<05:01, 1.55it/s, loss=0.218, lr=1] Steps: 69%|██████▉ | 1034/1500 [11:10<05:01, 1.55it/s, loss=0.218, lr=1] Steps: 69%|██████▉ | 1034/1500 [11:10<05:01, 1.55it/s, loss=0.0188, lr=1] Steps: 69%|██████▉ | 1035/1500 [11:10<05:00, 1.55it/s, loss=0.0188, lr=1] Steps: 69%|██████▉ | 1035/1500 [11:10<05:00, 1.55it/s, loss=0.139, lr=1] Steps: 69%|██████▉ | 1036/1500 [11:11<04:59, 1.55it/s, loss=0.139, lr=1] Steps: 69%|██████▉ | 1036/1500 [11:11<04:59, 1.55it/s, loss=0.191, lr=1] Steps: 69%|██████▉ | 1037/1500 [11:12<04:59, 1.55it/s, loss=0.191, lr=1] Steps: 69%|██████▉ | 1037/1500 [11:12<04:59, 1.55it/s, loss=0.103, lr=1] Steps: 69%|██████▉ | 1038/1500 [11:12<04:59, 1.54it/s, loss=0.103, lr=1] Steps: 69%|██████▉ | 1038/1500 [11:12<04:59, 1.54it/s, loss=0.1, lr=1] Steps: 69%|██████▉ | 1039/1500 [11:13<04:58, 1.54it/s, loss=0.1, lr=1] Steps: 69%|██████▉ | 1039/1500 [11:13<04:58, 1.54it/s, loss=0.106, lr=1] Steps: 69%|██████▉ | 1040/1500 [11:13<04:57, 1.55it/s, loss=0.106, lr=1] Steps: 69%|██████▉ | 1040/1500 [11:14<04:57, 1.55it/s, loss=0.144, lr=1] Steps: 69%|██████▉ | 1041/1500 [11:14<04:58, 1.54it/s, loss=0.144, lr=1] Steps: 69%|██████▉ | 1041/1500 [11:14<04:58, 1.54it/s, loss=0.218, lr=1] Steps: 69%|██████▉ | 1042/1500 [11:15<04:57, 
1.54it/s, loss=0.218, lr=1] Steps: 69%|██████▉ | 1042/1500 [11:15<04:57, 1.54it/s, loss=0.0844, lr=1] Steps: 70%|██████▉ | 1043/1500 [11:15<04:55, 1.54it/s, loss=0.0844, lr=1] Steps: 70%|██████▉ | 1043/1500 [11:15<04:55, 1.54it/s, loss=0.13, lr=1] Steps: 70%|██████▉ | 1044/1500 [11:16<04:54, 1.55it/s, loss=0.13, lr=1] Steps: 70%|██████▉ | 1044/1500 [11:16<04:54, 1.55it/s, loss=0.13, lr=1] Steps: 70%|██████▉ | 1045/1500 [11:17<04:54, 1.55it/s, loss=0.13, lr=1] Steps: 70%|██████▉ | 1045/1500 [11:17<04:54, 1.55it/s, loss=0.187, lr=1] Steps: 70%|██████▉ | 1046/1500 [11:17<04:53, 1.55it/s, loss=0.187, lr=1] Steps: 70%|██████▉ | 1046/1500 [11:17<04:53, 1.55it/s, loss=0.149, lr=1] Steps: 70%|██████▉ | 1047/1500 [11:18<04:52, 1.55it/s, loss=0.149, lr=1] Steps: 70%|██████▉ | 1047/1500 [11:18<04:52, 1.55it/s, loss=0.124, lr=1] Steps: 70%|██████▉ | 1048/1500 [11:19<04:51, 1.55it/s, loss=0.124, lr=1] Steps: 70%|██████▉ | 1048/1500 [11:19<04:51, 1.55it/s, loss=0.0894, lr=1] Steps: 70%|██████▉ | 1049/1500 [11:19<04:51, 1.55it/s, loss=0.0894, lr=1] Steps: 70%|██████▉ | 1049/1500 [11:19<04:51, 1.55it/s, loss=0.117, lr=1] Steps: 70%|███████ | 1050/1500 [11:20<04:50, 1.55it/s, loss=0.117, lr=1] Steps: 70%|███████ | 1050/1500 [11:20<04:50, 1.55it/s, loss=0.125, lr=1] Steps: 70%|███████ | 1051/1500 [11:21<04:50, 1.55it/s, loss=0.125, lr=1] Steps: 70%|███████ | 1051/1500 [11:21<04:50, 1.55it/s, loss=0.0965, lr=1] Steps: 70%|███████ | 1052/1500 [11:21<04:49, 1.55it/s, loss=0.0965, lr=1] Steps: 70%|███████ | 1052/1500 [11:21<04:49, 1.55it/s, loss=0.0396, lr=1] Steps: 70%|███████ | 1053/1500 [11:22<04:49, 1.55it/s, loss=0.0396, lr=1] Steps: 70%|███████ | 1053/1500 [11:22<04:49, 1.55it/s, loss=0.102, lr=1] Steps: 70%|███████ | 1054/1500 [11:23<04:48, 1.55it/s, loss=0.102, lr=1] Steps: 70%|███████ | 1054/1500 [11:23<04:48, 1.55it/s, loss=0.27, lr=1] Steps: 70%|███████ | 1055/1500 [11:23<04:47, 1.55it/s, loss=0.27, lr=1] Steps: 70%|███████ | 1055/1500 [11:23<04:47, 1.55it/s, loss=0.119, 
lr=1] Steps: 70%|███████ | 1056/1500 [11:24<04:47, 1.54it/s, loss=0.119, lr=1] Steps: 70%|███████ | 1056/1500 [11:24<04:47, 1.54it/s, loss=0.154, lr=1] Steps: 70%|███████ | 1057/1500 [11:25<04:48, 1.54it/s, loss=0.154, lr=1] Steps: 70%|███████ | 1057/1500 [11:25<04:48, 1.54it/s, loss=0.0516, lr=1] Steps: 71%|███████ | 1058/1500 [11:25<04:46, 1.54it/s, loss=0.0516, lr=1] Steps: 71%|███████ | 1058/1500 [11:25<04:46, 1.54it/s, loss=0.21, lr=1] Steps: 71%|███████ | 1059/1500 [11:26<04:45, 1.54it/s, loss=0.21, lr=1] Steps: 71%|███████ | 1059/1500 [11:26<04:45, 1.54it/s, loss=0.178, lr=1] Steps: 71%|███████ | 1060/1500 [11:26<04:44, 1.55it/s, loss=0.178, lr=1] Steps: 71%|███████ | 1060/1500 [11:26<04:44, 1.55it/s, loss=0.118, lr=1] Steps: 71%|███████ | 1061/1500 [11:27<04:43, 1.55it/s, loss=0.118, lr=1] Steps: 71%|███████ | 1061/1500 [11:27<04:43, 1.55it/s, loss=0.274, lr=1] Steps: 71%|███████ | 1062/1500 [11:28<04:42, 1.55it/s, loss=0.274, lr=1] Steps: 71%|███████ | 1062/1500 [11:28<04:42, 1.55it/s, loss=0.135, lr=1] Steps: 71%|███████ | 1063/1500 [11:28<04:42, 1.55it/s, loss=0.135, lr=1] Steps: 71%|███████ | 1063/1500 [11:28<04:42, 1.55it/s, loss=0.158, lr=1] Steps: 71%|███████ | 1064/1500 [11:29<04:41, 1.55it/s, loss=0.158, lr=1] Steps: 71%|███████ | 1064/1500 [11:29<04:41, 1.55it/s, loss=0.175, lr=1] Steps: 71%|███████ | 1065/1500 [11:30<04:40, 1.55it/s, loss=0.175, lr=1] Steps: 71%|███████ | 1065/1500 [11:30<04:40, 1.55it/s, loss=0.0599, lr=1] Steps: 71%|███████ | 1066/1500 [11:30<04:40, 1.55it/s, loss=0.0599, lr=1] Steps: 71%|███████ | 1066/1500 [11:30<04:40, 1.55it/s, loss=0.148, lr=1] Steps: 71%|███████ | 1067/1500 [11:31<04:39, 1.55it/s, loss=0.148, lr=1] Steps: 71%|███████ | 1067/1500 [11:31<04:39, 1.55it/s, loss=0.0743, lr=1] Steps: 71%|███████ | 1068/1500 [11:32<04:38, 1.55it/s, loss=0.0743, lr=1] Steps: 71%|███████ | 1068/1500 [11:32<04:38, 1.55it/s, loss=0.0792, lr=1] Steps: 71%|███████▏ | 1069/1500 [11:32<04:38, 1.55it/s, loss=0.0792, lr=1] Steps: 
71%|███████▏ | 1069/1500 [11:32<04:38, 1.55it/s, loss=0.0823, lr=1] Steps: 71%|███████▏ | 1070/1500 [11:33<04:37, 1.55it/s, loss=0.0823, lr=1] Steps: 71%|███████▏ | 1070/1500 [11:33<04:37, 1.55it/s, loss=0.042, lr=1] Steps: 71%|███████▏ | 1071/1500 [11:34<04:36, 1.55it/s, loss=0.042, lr=1] Steps: 71%|███████▏ | 1071/1500 [11:34<04:36, 1.55it/s, loss=0.0881, lr=1] Steps: 71%|███████▏ | 1072/1500 [11:34<04:35, 1.55it/s, loss=0.0881, lr=1] Steps: 71%|███████▏ | 1072/1500 [11:34<04:35, 1.55it/s, loss=0.249, lr=1] Steps: 72%|███████▏ | 1073/1500 [11:35<04:37, 1.54it/s, loss=0.249, lr=1] Steps: 72%|███████▏ | 1073/1500 [11:35<04:37, 1.54it/s, loss=0.176, lr=1] Steps: 72%|███████▏ | 1074/1500 [11:35<04:35, 1.54it/s, loss=0.176, lr=1] Steps: 72%|███████▏ | 1074/1500 [11:35<04:35, 1.54it/s, loss=0.111, lr=1] Steps: 72%|███████▏ | 1075/1500 [11:36<04:34, 1.55it/s, loss=0.111, lr=1] Steps: 72%|███████▏ | 1075/1500 [11:36<04:34, 1.55it/s, loss=0.19, lr=1] Steps: 72%|███████▏ | 1076/1500 [11:37<04:34, 1.55it/s, loss=0.19, lr=1] Steps: 72%|███████▏ | 1076/1500 [11:37<04:34, 1.55it/s, loss=0.118, lr=1] Steps: 72%|███████▏ | 1077/1500 [11:37<04:33, 1.55it/s, loss=0.118, lr=1] Steps: 72%|███████▏ | 1077/1500 [11:37<04:33, 1.55it/s, loss=0.113, lr=1] Steps: 72%|███████▏ | 1078/1500 [11:38<04:32, 1.55it/s, loss=0.113, lr=1] Steps: 72%|███████▏ | 1078/1500 [11:38<04:32, 1.55it/s, loss=0.0998, lr=1] Steps: 72%|███████▏ | 1079/1500 [11:39<04:31, 1.55it/s, loss=0.0998, lr=1] Steps: 72%|███████▏ | 1079/1500 [11:39<04:31, 1.55it/s, loss=0.093, lr=1] Steps: 72%|███████▏ | 1080/1500 [11:39<04:31, 1.55it/s, loss=0.093, lr=1] Steps: 72%|███████▏ | 1080/1500 [11:39<04:31, 1.55it/s, loss=0.123, lr=1] Steps: 72%|███████▏ | 1081/1500 [11:40<04:30, 1.55it/s, loss=0.123, lr=1] Steps: 72%|███████▏ | 1081/1500 [11:40<04:30, 1.55it/s, loss=0.129, lr=1] Steps: 72%|███████▏ | 1082/1500 [11:41<04:29, 1.55it/s, loss=0.129, lr=1] Steps: 72%|███████▏ | 1082/1500 [11:41<04:29, 1.55it/s, loss=0.0877, lr=1] 
Steps: 72%|███████▏ | 1083/1500 [11:41<04:29, 1.55it/s, loss=0.0877, lr=1] Steps: 72%|███████▏ | 1083/1500 [11:41<04:29, 1.55it/s, loss=0.135, lr=1] Steps: 72%|███████▏ | 1084/1500 [11:42<04:28, 1.55it/s, loss=0.135, lr=1] Steps: 72%|███████▏ | 1084/1500 [11:42<04:28, 1.55it/s, loss=0.191, lr=1] Steps: 72%|███████▏ | 1085/1500 [11:43<04:27, 1.55it/s, loss=0.191, lr=1] Steps: 72%|███████▏ | 1085/1500 [11:43<04:27, 1.55it/s, loss=0.0872, lr=1] Steps: 72%|███████▏ | 1086/1500 [11:43<04:26, 1.55it/s, loss=0.0872, lr=1] Steps: 72%|███████▏ | 1086/1500 [11:43<04:26, 1.55it/s, loss=0.158, lr=1] Steps: 72%|███████▏ | 1087/1500 [11:44<04:26, 1.55it/s, loss=0.158, lr=1] Steps: 72%|███████▏ | 1087/1500 [11:44<04:26, 1.55it/s, loss=0.167, lr=1] Steps: 73%|███████▎ | 1088/1500 [11:45<04:25, 1.55it/s, loss=0.167, lr=1] Steps: 73%|███████▎ | 1088/1500 [11:45<04:25, 1.55it/s, loss=0.142, lr=1] Steps: 73%|███████▎ | 1089/1500 [11:45<04:26, 1.54it/s, loss=0.142, lr=1] Steps: 73%|███████▎ | 1089/1500 [11:45<04:26, 1.54it/s, loss=0.144, lr=1] Steps: 73%|███████▎ | 1090/1500 [11:46<04:25, 1.54it/s, loss=0.144, lr=1] Steps: 73%|███████▎ | 1090/1500 [11:46<04:25, 1.54it/s, loss=0.175, lr=1] Steps: 73%|███████▎ | 1091/1500 [11:46<04:24, 1.55it/s, loss=0.175, lr=1] Steps: 73%|███████▎ | 1091/1500 [11:46<04:24, 1.55it/s, loss=0.167, lr=1] Steps: 73%|███████▎ | 1092/1500 [11:47<04:23, 1.55it/s, loss=0.167, lr=1] Steps: 73%|███████▎ | 1092/1500 [11:47<04:23, 1.55it/s, loss=0.203, lr=1] Steps: 73%|███████▎ | 1093/1500 [11:48<04:22, 1.55it/s, loss=0.203, lr=1] Steps: 73%|███████▎ | 1093/1500 [11:48<04:22, 1.55it/s, loss=0.05, lr=1] Steps: 73%|███████▎ | 1094/1500 [11:48<04:22, 1.55it/s, loss=0.05, lr=1] Steps: 73%|███████▎ | 1094/1500 [11:48<04:22, 1.55it/s, loss=0.124, lr=1] Steps: 73%|███████▎ | 1095/1500 [11:49<04:21, 1.55it/s, loss=0.124, lr=1] Steps: 73%|███████▎ | 1095/1500 [11:49<04:21, 1.55it/s, loss=0.0726, lr=1] Steps: 73%|███████▎ | 1096/1500 [11:50<04:20, 1.55it/s, loss=0.0726, 
lr=1] Steps: 73%|███████▎ | 1096/1500 [11:50<04:20, 1.55it/s, loss=0.117, lr=1] Steps: 73%|███████▎ | 1097/1500 [11:50<04:20, 1.55it/s, loss=0.117, lr=1] Steps: 73%|███████▎ | 1097/1500 [11:50<04:20, 1.55it/s, loss=0.171, lr=1] Steps: 73%|███████▎ | 1098/1500 [11:51<04:19, 1.55it/s, loss=0.171, lr=1] Steps: 73%|███████▎ | 1098/1500 [11:51<04:19, 1.55it/s, loss=0.19, lr=1] Steps: 73%|███████▎ | 1099/1500 [11:52<04:18, 1.55it/s, loss=0.19, lr=1] Steps: 73%|███████▎ | 1099/1500 [11:52<04:18, 1.55it/s, loss=0.0613, lr=1] Steps: 73%|███████▎ | 1100/1500 [11:52<04:18, 1.55it/s, loss=0.0613, lr=1] Steps: 73%|███████▎ | 1100/1500 [11:52<04:18, 1.55it/s, loss=0.132, lr=1] Steps: 73%|███████▎ | 1101/1500 [11:53<04:17, 1.55it/s, loss=0.132, lr=1] Steps: 73%|███████▎ | 1101/1500 [11:53<04:17, 1.55it/s, loss=0.0821, lr=1] Steps: 73%|███████▎ | 1102/1500 [11:54<04:16, 1.55it/s, loss=0.0821, lr=1] Steps: 73%|███████▎ | 1102/1500 [11:54<04:16, 1.55it/s, loss=0.152, lr=1] Steps: 74%|███████▎ | 1103/1500 [11:54<04:16, 1.55it/s, loss=0.152, lr=1] Steps: 74%|███████▎ | 1103/1500 [11:54<04:16, 1.55it/s, loss=0.138, lr=1] Steps: 74%|███████▎ | 1104/1500 [11:55<04:15, 1.55it/s, loss=0.138, lr=1] Steps: 74%|███████▎ | 1104/1500 [11:55<04:15, 1.55it/s, loss=0.274, lr=1] Steps: 74%|███████▎ | 1105/1500 [11:56<04:16, 1.54it/s, loss=0.274, lr=1] Steps: 74%|███████▎ | 1105/1500 [11:56<04:16, 1.54it/s, loss=0.091, lr=1] Steps: 74%|███████▎ | 1106/1500 [11:56<04:15, 1.54it/s, loss=0.091, lr=1] Steps: 74%|███████▎ | 1106/1500 [11:56<04:15, 1.54it/s, loss=0.0875, lr=1] Steps: 74%|███████▍ | 1107/1500 [11:57<04:14, 1.55it/s, loss=0.0875, lr=1] Steps: 74%|███████▍ | 1107/1500 [11:57<04:14, 1.55it/s, loss=0.203, lr=1] Steps: 74%|███████▍ | 1108/1500 [11:57<04:13, 1.55it/s, loss=0.203, lr=1] Steps: 74%|███████▍ | 1108/1500 [11:57<04:13, 1.55it/s, loss=0.0384, lr=1] Steps: 74%|███████▍ | 1109/1500 [11:58<04:12, 1.55it/s, loss=0.0384, lr=1] Steps: 74%|███████▍ | 1109/1500 [11:58<04:12, 1.55it/s, 
loss=0.137, lr=1] Steps: 74%|███████▍ | 1110/1500 [11:59<04:11, 1.55it/s, loss=0.137, lr=1] Steps: 74%|███████▍ | 1110/1500 [11:59<04:11, 1.55it/s, loss=0.18, lr=1] Steps: 74%|███████▍ | 1111/1500 [11:59<04:11, 1.55it/s, loss=0.18, lr=1] Steps: 74%|███████▍ | 1111/1500 [11:59<04:11, 1.55it/s, loss=0.186, lr=1] Steps: 74%|███████▍ | 1112/1500 [12:00<04:10, 1.55it/s, loss=0.186, lr=1] Steps: 74%|███████▍ | 1112/1500 [12:00<04:10, 1.55it/s, loss=0.0793, lr=1] Steps: 74%|███████▍ | 1113/1500 [12:01<04:09, 1.55it/s, loss=0.0793, lr=1] Steps: 74%|███████▍ | 1113/1500 [12:01<04:09, 1.55it/s, loss=0.136, lr=1] Steps: 74%|███████▍ | 1114/1500 [12:01<04:09, 1.55it/s, loss=0.136, lr=1] Steps: 74%|███████▍ | 1114/1500 [12:01<04:09, 1.55it/s, loss=0.149, lr=1] Steps: 74%|███████▍ | 1115/1500 [12:02<04:08, 1.55it/s, loss=0.149, lr=1] Steps: 74%|███████▍ | 1115/1500 [12:02<04:08, 1.55it/s, loss=0.122, lr=1] Steps: 74%|███████▍ | 1116/1500 [12:03<04:07, 1.55it/s, loss=0.122, lr=1] Steps: 74%|███████▍ | 1116/1500 [12:03<04:07, 1.55it/s, loss=0.152, lr=1] Steps: 74%|███████▍ | 1117/1500 [12:03<04:07, 1.55it/s, loss=0.152, lr=1] Steps: 74%|███████▍ | 1117/1500 [12:03<04:07, 1.55it/s, loss=0.0338, lr=1] Steps: 75%|███████▍ | 1118/1500 [12:04<04:07, 1.54it/s, loss=0.0338, lr=1] Steps: 75%|███████▍ | 1118/1500 [12:04<04:07, 1.54it/s, loss=0.0932, lr=1] Steps: 75%|███████▍ | 1119/1500 [12:05<04:06, 1.55it/s, loss=0.0932, lr=1] Steps: 75%|███████▍ | 1119/1500 [12:05<04:06, 1.55it/s, loss=0.164, lr=1] Steps: 75%|███████▍ | 1120/1500 [12:05<04:05, 1.55it/s, loss=0.164, lr=1] Steps: 75%|███████▍ | 1120/1500 [12:05<04:05, 1.55it/s, loss=0.0811, lr=1] Steps: 75%|███████▍ | 1121/1500 [12:06<04:06, 1.54it/s, loss=0.0811, lr=1] Steps: 75%|███████▍ | 1121/1500 [12:06<04:06, 1.54it/s, loss=0.104, lr=1] Steps: 75%|███████▍ | 1122/1500 [12:06<04:05, 1.54it/s, loss=0.104, lr=1] Steps: 75%|███████▍ | 1122/1500 [12:06<04:05, 1.54it/s, loss=0.125, lr=1] Steps: 75%|███████▍ | 1123/1500 [12:07<04:04, 
1.54it/s, loss=0.125, lr=1] Steps: 75%|███████▍ | 1123/1500 [12:07<04:04, 1.54it/s, loss=0.131, lr=1] Steps: 75%|███████▍ | 1124/1500 [12:08<04:03, 1.55it/s, loss=0.131, lr=1] Steps: 75%|███████▍ | 1124/1500 [12:08<04:03, 1.55it/s, loss=0.123, lr=1] Steps: 75%|███████▌ | 1125/1500 [12:08<04:02, 1.55it/s, loss=0.123, lr=1] Steps: 75%|███████▌ | 1125/1500 [12:08<04:02, 1.55it/s, loss=0.0789, lr=1] Steps: 75%|███████▌ | 1126/1500 [12:09<04:01, 1.55it/s, loss=0.0789, lr=1] Steps: 75%|███████▌ | 1126/1500 [12:09<04:01, 1.55it/s, loss=0.101, lr=1] Steps: 75%|███████▌ | 1127/1500 [12:10<04:00, 1.55it/s, loss=0.101, lr=1] Steps: 75%|███████▌ | 1127/1500 [12:10<04:00, 1.55it/s, loss=0.165, lr=1] Steps: 75%|███████▌ | 1128/1500 [12:10<04:00, 1.55it/s, loss=0.165, lr=1] Steps: 75%|███████▌ | 1128/1500 [12:10<04:00, 1.55it/s, loss=0.0926, lr=1] Steps: 75%|███████▌ | 1129/1500 [12:11<03:59, 1.55it/s, loss=0.0926, lr=1] Steps: 75%|███████▌ | 1129/1500 [12:11<03:59, 1.55it/s, loss=0.0838, lr=1] Steps: 75%|███████▌ | 1130/1500 [12:12<03:58, 1.55it/s, loss=0.0838, lr=1] Steps: 75%|███████▌ | 1130/1500 [12:12<03:58, 1.55it/s, loss=0.103, lr=1] Steps: 75%|███████▌ | 1131/1500 [12:12<03:58, 1.55it/s, loss=0.103, lr=1] Steps: 75%|███████▌ | 1131/1500 [12:12<03:58, 1.55it/s, loss=0.117, lr=1] Steps: 75%|███████▌ | 1132/1500 [12:13<03:57, 1.55it/s, loss=0.117, lr=1] Steps: 75%|███████▌ | 1132/1500 [12:13<03:57, 1.55it/s, loss=0.115, lr=1] Steps: 76%|███████▌ | 1133/1500 [12:14<03:56, 1.55it/s, loss=0.115, lr=1] Steps: 76%|███████▌ | 1133/1500 [12:14<03:56, 1.55it/s, loss=0.32, lr=1] Steps: 76%|███████▌ | 1134/1500 [12:14<03:56, 1.55it/s, loss=0.32, lr=1] Steps: 76%|███████▌ | 1134/1500 [12:14<03:56, 1.55it/s, loss=0.155, lr=1] Steps: 76%|███████▌ | 1135/1500 [12:15<03:55, 1.55it/s, loss=0.155, lr=1] Steps: 76%|███████▌ | 1135/1500 [12:15<03:55, 1.55it/s, loss=0.21, lr=1] Steps: 76%|███████▌ | 1136/1500 [12:16<03:54, 1.55it/s, loss=0.21, lr=1] Steps: 76%|███████▌ | 1136/1500 [12:16<03:54, 
1.55it/s, loss=0.172, lr=1] Steps: 76%|███████▌ | 1137/1500 [12:16<03:55, 1.54it/s, loss=0.172, lr=1] Steps: 76%|███████▌ | 1137/1500 [12:16<03:55, 1.54it/s, loss=0.156, lr=1] Steps: 76%|███████▌ | 1138/1500 [12:17<03:54, 1.54it/s, loss=0.156, lr=1] Steps: 76%|███████▌ | 1138/1500 [12:17<03:54, 1.54it/s, loss=0.122, lr=1] Steps: 76%|███████▌ | 1139/1500 [12:17<03:53, 1.55it/s, loss=0.122, lr=1] Steps: 76%|███████▌ | 1139/1500 [12:17<03:53, 1.55it/s, loss=0.137, lr=1] Steps: 76%|███████▌ | 1140/1500 [12:18<03:52, 1.55it/s, loss=0.137, lr=1] Steps: 76%|███████▌ | 1140/1500 [12:18<03:52, 1.55it/s, loss=0.0809, lr=1] Steps: 76%|███████▌ | 1141/1500 [12:19<03:51, 1.55it/s, loss=0.0809, lr=1] Steps: 76%|███████▌ | 1141/1500 [12:19<03:51, 1.55it/s, loss=0.109, lr=1] Steps: 76%|███████▌ | 1142/1500 [12:19<03:51, 1.55it/s, loss=0.109, lr=1] Steps: 76%|███████▌ | 1142/1500 [12:19<03:51, 1.55it/s, loss=0.054, lr=1] Steps: 76%|███████▌ | 1143/1500 [12:20<03:50, 1.55it/s, loss=0.054, lr=1] Steps: 76%|███████▌ | 1143/1500 [12:20<03:50, 1.55it/s, loss=0.0625, lr=1] Steps: 76%|███████▋ | 1144/1500 [12:21<03:49, 1.55it/s, loss=0.0625, lr=1] Steps: 76%|███████▋ | 1144/1500 [12:21<03:49, 1.55it/s, loss=0.118, lr=1] Steps: 76%|███████▋ | 1145/1500 [12:21<03:49, 1.55it/s, loss=0.118, lr=1] Steps: 76%|███████▋ | 1145/1500 [12:21<03:49, 1.55it/s, loss=0.0942, lr=1] Steps: 76%|███████▋ | 1146/1500 [12:22<03:48, 1.55it/s, loss=0.0942, lr=1] Steps: 76%|███████▋ | 1146/1500 [12:22<03:48, 1.55it/s, loss=0.0837, lr=1] Steps: 76%|███████▋ | 1147/1500 [12:23<03:48, 1.55it/s, loss=0.0837, lr=1] Steps: 76%|███████▋ | 1147/1500 [12:23<03:48, 1.55it/s, loss=0.0677, lr=1] Steps: 77%|███████▋ | 1148/1500 [12:23<03:47, 1.55it/s, loss=0.0677, lr=1] Steps: 77%|███████▋ | 1148/1500 [12:23<03:47, 1.55it/s, loss=0.197, lr=1] Steps: 77%|███████▋ | 1149/1500 [12:24<03:46, 1.55it/s, loss=0.197, lr=1] Steps: 77%|███████▋ | 1149/1500 [12:24<03:46, 1.55it/s, loss=0.229, lr=1] Steps: 77%|███████▋ | 1150/1500 
[12:25<03:46, 1.55it/s, loss=0.229, lr=1] Steps: 77%|███████▋ | 1150/1500 [12:25<03:46, 1.55it/s, loss=0.148, lr=1] Steps: 77%|███████▋ | 1151/1500 [12:25<03:45, 1.55it/s, loss=0.148, lr=1] Steps: 77%|███████▋ | 1151/1500 [12:25<03:45, 1.55it/s, loss=0.0846, lr=1] Steps: 77%|███████▋ | 1152/1500 [12:26<03:44, 1.55it/s, loss=0.0846, lr=1] Steps: 77%|███████▋ | 1152/1500 [12:26<03:44, 1.55it/s, loss=0.0456, lr=1] Steps: 77%|███████▋ | 1153/1500 [12:27<03:45, 1.54it/s, loss=0.0456, lr=1] Steps: 77%|███████▋ | 1153/1500 [12:27<03:45, 1.54it/s, loss=0.0381, lr=1] Steps: 77%|███████▋ | 1154/1500 [12:27<03:44, 1.54it/s, loss=0.0381, lr=1] Steps: 77%|███████▋ | 1154/1500 [12:27<03:44, 1.54it/s, loss=0.0959, lr=1] Steps: 77%|███████▋ | 1155/1500 [12:28<03:43, 1.54it/s, loss=0.0959, lr=1] Steps: 77%|███████▋ | 1155/1500 [12:28<03:43, 1.54it/s, loss=0.251, lr=1] Steps: 77%|███████▋ | 1156/1500 [12:28<03:42, 1.55it/s, loss=0.251, lr=1] Steps: 77%|███████▋ | 1156/1500 [12:28<03:42, 1.55it/s, loss=0.0685, lr=1] Steps: 77%|███████▋ | 1157/1500 [12:29<03:41, 1.55it/s, loss=0.0685, lr=1] Steps: 77%|███████▋ | 1157/1500 [12:29<03:41, 1.55it/s, loss=0.0704, lr=1] Steps: 77%|███████▋ | 1158/1500 [12:30<03:40, 1.55it/s, loss=0.0704, lr=1] Steps: 77%|███████▋ | 1158/1500 [12:30<03:40, 1.55it/s, loss=0.124, lr=1] Steps: 77%|███████▋ | 1159/1500 [12:30<03:40, 1.55it/s, loss=0.124, lr=1] Steps: 77%|███████▋ | 1159/1500 [12:30<03:40, 1.55it/s, loss=0.0207, lr=1] Steps: 77%|███████▋ | 1160/1500 [12:31<03:39, 1.55it/s, loss=0.0207, lr=1] Steps: 77%|███████▋ | 1160/1500 [12:31<03:39, 1.55it/s, loss=0.167, lr=1] Steps: 77%|███████▋ | 1161/1500 [12:32<03:38, 1.55it/s, loss=0.167, lr=1] Steps: 77%|███████▋ | 1161/1500 [12:32<03:38, 1.55it/s, loss=0.0681, lr=1] Steps: 77%|███████▋ | 1162/1500 [12:32<03:38, 1.55it/s, loss=0.0681, lr=1] Steps: 77%|███████▋ | 1162/1500 [12:32<03:38, 1.55it/s, loss=0.0589, lr=1] Steps: 78%|███████▊ | 1163/1500 [12:33<03:37, 1.55it/s, loss=0.0589, lr=1] Steps: 
78%|███████▊ | 1163/1500 [12:33<03:37, 1.55it/s, loss=0.108, lr=1] Steps: 78%|███████▊ | 1164/1500 [12:34<03:36, 1.55it/s, loss=0.108, lr=1] Steps: 78%|███████▊ | 1164/1500 [12:34<03:36, 1.55it/s, loss=0.29, lr=1] Steps: 78%|███████▊ | 1165/1500 [12:34<03:36, 1.54it/s, loss=0.29, lr=1] Steps: 78%|███████▊ | 1165/1500 [12:34<03:36, 1.54it/s, loss=0.122, lr=1] Steps: 78%|███████▊ | 1166/1500 [12:35<03:36, 1.55it/s, loss=0.122, lr=1] Steps: 78%|███████▊ | 1166/1500 [12:35<03:36, 1.55it/s, loss=0.0519, lr=1] Steps: 78%|███████▊ | 1167/1500 [12:36<03:35, 1.55it/s, loss=0.0519, lr=1] Steps: 78%|███████▊ | 1167/1500 [12:36<03:35, 1.55it/s, loss=0.1, lr=1] Steps: 78%|███████▊ | 1168/1500 [12:36<03:34, 1.55it/s, loss=0.1, lr=1] Steps: 78%|███████▊ | 1168/1500 [12:36<03:34, 1.55it/s, loss=0.132, lr=1] Steps: 78%|███████▊ | 1169/1500 [12:37<03:36, 1.53it/s, loss=0.132, lr=1] Steps: 78%|███████▊ | 1169/1500 [12:37<03:36, 1.53it/s, loss=0.121, lr=1] Steps: 78%|███████▊ | 1170/1500 [12:38<03:35, 1.53it/s, loss=0.121, lr=1] Steps: 78%|███████▊ | 1170/1500 [12:38<03:35, 1.53it/s, loss=0.306, lr=1] Steps: 78%|███████▊ | 1171/1500 [12:38<03:34, 1.53it/s, loss=0.306, lr=1] Steps: 78%|███████▊ | 1171/1500 [12:38<03:34, 1.53it/s, loss=0.0996, lr=1] Steps: 78%|███████▊ | 1172/1500 [12:39<03:33, 1.54it/s, loss=0.0996, lr=1] Steps: 78%|███████▊ | 1172/1500 [12:39<03:33, 1.54it/s, loss=0.147, lr=1] Steps: 78%|███████▊ | 1173/1500 [12:39<03:32, 1.54it/s, loss=0.147, lr=1] Steps: 78%|███████▊ | 1173/1500 [12:39<03:32, 1.54it/s, loss=0.134, lr=1] Steps: 78%|███████▊ | 1174/1500 [12:40<03:31, 1.54it/s, loss=0.134, lr=1] Steps: 78%|███████▊ | 1174/1500 [12:40<03:31, 1.54it/s, loss=0.174, lr=1] Steps: 78%|███████▊ | 1175/1500 [12:41<03:30, 1.54it/s, loss=0.174, lr=1] Steps: 78%|███████▊ | 1175/1500 [12:41<03:30, 1.54it/s, loss=0.106, lr=1] Steps: 78%|███████▊ | 1176/1500 [12:41<03:29, 1.54it/s, loss=0.106, lr=1] Steps: 78%|███████▊ | 1176/1500 [12:41<03:29, 1.54it/s, loss=0.148, lr=1] Steps: 
78%|███████▊ | 1177/1500 [12:42<03:29, 1.55it/s, loss=0.148, lr=1] Steps: 78%|███████▊ | 1177/1500 [12:42<03:29, 1.55it/s, loss=0.0922, lr=1] Steps: 79%|███████▊ | 1178/1500 [12:43<03:28, 1.54it/s, loss=0.0922, lr=1] Steps: 79%|███████▊ | 1178/1500 [12:43<03:28, 1.54it/s, loss=0.194, lr=1] Steps: 79%|███████▊ | 1179/1500 [12:43<03:27, 1.54it/s, loss=0.194, lr=1] Steps: 79%|███████▊ | 1179/1500 [12:43<03:27, 1.54it/s, loss=0.135, lr=1] Steps: 79%|███████▊ | 1180/1500 [12:44<03:39, 1.46it/s, loss=0.135, lr=1] Steps: 79%|███████▊ | 1180/1500 [12:44<03:39, 1.46it/s, loss=0.0799, lr=1] Steps: 79%|███████▊ | 1181/1500 [12:45<03:34, 1.48it/s, loss=0.0799, lr=1] Steps: 79%|███████▊ | 1181/1500 [12:45<03:34, 1.48it/s, loss=0.225, lr=1] Steps: 79%|███████▉ | 1182/1500 [12:45<03:31, 1.50it/s, loss=0.225, lr=1] Steps: 79%|███████▉ | 1182/1500 [12:45<03:31, 1.50it/s, loss=0.0779, lr=1] Steps: 79%|███████▉ | 1183/1500 [12:46<03:28, 1.52it/s, loss=0.0779, lr=1] Steps: 79%|███████▉ | 1183/1500 [12:46<03:28, 1.52it/s, loss=0.184, lr=1] Steps: 79%|███████▉ | 1184/1500 [12:47<03:26, 1.53it/s, loss=0.184, lr=1] Steps: 79%|███████▉ | 1184/1500 [12:47<03:26, 1.53it/s, loss=0.0466, lr=1] Steps: 79%|███████▉ | 1185/1500 [12:47<03:26, 1.53it/s, loss=0.0466, lr=1] Steps: 79%|███████▉ | 1185/1500 [12:47<03:26, 1.53it/s, loss=0.239, lr=1] Steps: 79%|███████▉ | 1186/1500 [12:48<03:24, 1.53it/s, loss=0.239, lr=1] Steps: 79%|███████▉ | 1186/1500 [12:48<03:24, 1.53it/s, loss=0.115, lr=1] Steps: 79%|███████▉ | 1187/1500 [12:49<03:23, 1.54it/s, loss=0.115, lr=1] Steps: 79%|███████▉ | 1187/1500 [12:49<03:23, 1.54it/s, loss=0.112, lr=1] Steps: 79%|███████▉ | 1188/1500 [12:49<03:22, 1.54it/s, loss=0.112, lr=1] Steps: 79%|███████▉ | 1188/1500 [12:49<03:22, 1.54it/s, loss=0.0711, lr=1] Steps: 79%|███████▉ | 1189/1500 [12:50<03:21, 1.54it/s, loss=0.0711, lr=1] Steps: 79%|███████▉ | 1189/1500 [12:50<03:21, 1.54it/s, loss=0.0877, lr=1] Steps: 79%|███████▉ | 1190/1500 [12:51<03:20, 1.54it/s, loss=0.0877, 
lr=1] Steps: 79%|███████▉ | 1190/1500 [12:51<03:20, 1.54it/s, loss=0.0816, lr=1] Steps: 79%|███████▉ | 1191/1500 [12:51<03:19, 1.55it/s, loss=0.0816, lr=1] Steps: 79%|███████▉ | 1191/1500 [12:51<03:19, 1.55it/s, loss=0.0971, lr=1] Steps: 79%|███████▉ | 1192/1500 [12:52<03:19, 1.55it/s, loss=0.0971, lr=1] Steps: 79%|███████▉ | 1192/1500 [12:52<03:19, 1.55it/s, loss=0.171, lr=1] Steps: 80%|███████▉ | 1193/1500 [12:53<03:18, 1.55it/s, loss=0.171, lr=1] Steps: 80%|███████▉ | 1193/1500 [12:53<03:18, 1.55it/s, loss=0.338, lr=1] Steps: 80%|███████▉ | 1194/1500 [12:53<03:17, 1.55it/s, loss=0.338, lr=1] Steps: 80%|███████▉ | 1194/1500 [12:53<03:17, 1.55it/s, loss=0.146, lr=1] Steps: 80%|███████▉ | 1195/1500 [12:54<03:17, 1.55it/s, loss=0.146, lr=1] Steps: 80%|███████▉ | 1195/1500 [12:54<03:17, 1.55it/s, loss=0.218, lr=1] Steps: 80%|███████▉ | 1196/1500 [12:54<03:16, 1.55it/s, loss=0.218, lr=1] Steps: 80%|███████▉ | 1196/1500 [12:54<03:16, 1.55it/s, loss=0.0801, lr=1] Steps: 80%|███████▉ | 1197/1500 [12:55<03:16, 1.54it/s, loss=0.0801, lr=1] Steps: 80%|███████▉ | 1197/1500 [12:55<03:16, 1.54it/s, loss=0.0925, lr=1] Steps: 80%|███████▉ | 1198/1500 [12:56<03:15, 1.54it/s, loss=0.0925, lr=1] Steps: 80%|███████▉ | 1198/1500 [12:56<03:15, 1.54it/s, loss=0.152, lr=1] Steps: 80%|███████▉ | 1199/1500 [12:56<03:14, 1.54it/s, loss=0.152, lr=1] Steps: 80%|███████▉ | 1199/1500 [12:56<03:14, 1.54it/s, loss=0.138, lr=1] Steps: 80%|████████ | 1200/1500 [12:57<03:14, 1.55it/s, loss=0.138, lr=1] Steps: 80%|████████ | 1200/1500 [12:57<03:14, 1.55it/s, loss=0.146, lr=1] Steps: 80%|████████ | 1201/1500 [12:58<03:14, 1.54it/s, loss=0.146, lr=1] Steps: 80%|████████ | 1201/1500 [12:58<03:14, 1.54it/s, loss=0.0313, lr=1] Steps: 80%|████████ | 1202/1500 [12:58<03:13, 1.54it/s, loss=0.0313, lr=1] Steps: 80%|████████ | 1202/1500 [12:58<03:13, 1.54it/s, loss=0.0984, lr=1] Steps: 80%|████████ | 1203/1500 [12:59<03:12, 1.54it/s, loss=0.0984, lr=1] Steps: 80%|████████ | 1203/1500 [12:59<03:12, 1.54it/s, 
loss=0.0619, lr=1] Steps: 80%|████████ | 1204/1500 [13:00<03:11, 1.54it/s, loss=0.0619, lr=1] Steps: 80%|████████ | 1204/1500 [13:00<03:11, 1.54it/s, loss=0.0622, lr=1] Steps: 80%|████████ | 1205/1500 [13:00<03:11, 1.54it/s, loss=0.0622, lr=1] Steps: 80%|████████ | 1205/1500 [13:00<03:11, 1.54it/s, loss=0.128, lr=1] Steps: 80%|████████ | 1206/1500 [13:01<03:10, 1.55it/s, loss=0.128, lr=1] Steps: 80%|████████ | 1206/1500 [13:01<03:10, 1.55it/s, loss=0.034, lr=1] Steps: 80%|████████ | 1207/1500 [13:02<03:09, 1.55it/s, loss=0.034, lr=1] Steps: 80%|████████ | 1207/1500 [13:02<03:09, 1.55it/s, loss=0.124, lr=1] Steps: 81%|████████ | 1208/1500 [13:02<03:08, 1.55it/s, loss=0.124, lr=1] Steps: 81%|████████ | 1208/1500 [13:02<03:08, 1.55it/s, loss=0.117, lr=1] Steps: 81%|████████ | 1209/1500 [13:03<03:07, 1.55it/s, loss=0.117, lr=1] Steps: 81%|████████ | 1209/1500 [13:03<03:07, 1.55it/s, loss=0.14, lr=1] Steps: 81%|████████ | 1210/1500 [13:04<03:07, 1.55it/s, loss=0.14, lr=1] Steps: 81%|████████ | 1210/1500 [13:04<03:07, 1.55it/s, loss=0.104, lr=1] Steps: 81%|████████ | 1211/1500 [13:04<03:06, 1.55it/s, loss=0.104, lr=1] Steps: 81%|████████ | 1211/1500 [13:04<03:06, 1.55it/s, loss=0.241, lr=1] Steps: 81%|████████ | 1212/1500 [13:05<03:06, 1.55it/s, loss=0.241, lr=1] Steps: 81%|████████ | 1212/1500 [13:05<03:06, 1.55it/s, loss=0.109, lr=1] Steps: 81%|████████ | 1213/1500 [13:05<03:05, 1.54it/s, loss=0.109, lr=1] Steps: 81%|████████ | 1213/1500 [13:05<03:05, 1.54it/s, loss=0.0809, lr=1] Steps: 81%|████████ | 1214/1500 [13:06<03:05, 1.54it/s, loss=0.0809, lr=1] Steps: 81%|████████ | 1214/1500 [13:06<03:05, 1.54it/s, loss=0.0724, lr=1] Steps: 81%|████████ | 1215/1500 [13:07<03:04, 1.54it/s, loss=0.0724, lr=1] Steps: 81%|████████ | 1215/1500 [13:07<03:04, 1.54it/s, loss=0.146, lr=1] Steps: 81%|████████ | 1216/1500 [13:07<03:03, 1.55it/s, loss=0.146, lr=1] Steps: 81%|████████ | 1216/1500 [13:07<03:03, 1.55it/s, loss=0.0946, lr=1] Steps: 81%|████████ | 1217/1500 [13:08<03:04, 
1.54it/s, loss=0.0946, lr=1] Steps: 81%|████████ | 1217/1500 [13:08<03:04, 1.54it/s, loss=0.161, lr=1] Steps: 81%|████████ | 1218/1500 [13:09<03:03, 1.54it/s, loss=0.161, lr=1] Steps: 81%|████████ | 1218/1500 [13:09<03:03, 1.54it/s, loss=0.0411, lr=1] Steps: 81%|████████▏ | 1219/1500 [13:09<03:02, 1.54it/s, loss=0.0411, lr=1] Steps: 81%|████████▏ | 1219/1500 [13:09<03:02, 1.54it/s, loss=0.0622, lr=1] Steps: 81%|████████▏ | 1220/1500 [13:10<03:01, 1.54it/s, loss=0.0622, lr=1] Steps: 81%|████████▏ | 1220/1500 [13:10<03:01, 1.54it/s, loss=0.302, lr=1] Steps: 81%|████████▏ | 1221/1500 [13:11<03:00, 1.54it/s, loss=0.302, lr=1] Steps: 81%|████████▏ | 1221/1500 [13:11<03:00, 1.54it/s, loss=0.197, lr=1] Steps: 81%|████████▏ | 1222/1500 [13:11<03:00, 1.54it/s, loss=0.197, lr=1] Steps: 81%|████████▏ | 1222/1500 [13:11<03:00, 1.54it/s, loss=0.2, lr=1] Steps: 82%|████████▏ | 1223/1500 [13:12<02:59, 1.54it/s, loss=0.2, lr=1] Steps: 82%|████████▏ | 1223/1500 [13:12<02:59, 1.54it/s, loss=0.278, lr=1] Steps: 82%|████████▏ | 1224/1500 [13:13<02:58, 1.54it/s, loss=0.278, lr=1] Steps: 82%|████████▏ | 1224/1500 [13:13<02:58, 1.54it/s, loss=0.123, lr=1] Steps: 82%|████████▏ | 1225/1500 [13:13<02:58, 1.54it/s, loss=0.123, lr=1] Steps: 82%|████████▏ | 1225/1500 [13:13<02:58, 1.54it/s, loss=0.0647, lr=1] Steps: 82%|████████▏ | 1226/1500 [13:14<02:57, 1.54it/s, loss=0.0647, lr=1] Steps: 82%|████████▏ | 1226/1500 [13:14<02:57, 1.54it/s, loss=0.0565, lr=1] Steps: 82%|████████▏ | 1227/1500 [13:15<02:56, 1.55it/s, loss=0.0565, lr=1] Steps: 82%|████████▏ | 1227/1500 [13:15<02:56, 1.55it/s, loss=0.173, lr=1] Steps: 82%|████████▏ | 1228/1500 [13:15<02:55, 1.55it/s, loss=0.173, lr=1] Steps: 82%|████████▏ | 1228/1500 [13:15<02:55, 1.55it/s, loss=0.106, lr=1] Steps: 82%|████████▏ | 1229/1500 [13:16<02:55, 1.55it/s, loss=0.106, lr=1] Steps: 82%|████████▏ | 1229/1500 [13:16<02:55, 1.55it/s, loss=0.0923, lr=1] Steps: 82%|████████▏ | 1230/1500 [13:16<02:54, 1.55it/s, loss=0.0923, lr=1] Steps: 
82%|████████▏ | 1230/1500 [13:16<02:54, 1.55it/s, loss=0.0842, lr=1] Steps: 82%|████████▏ | 1231/1500 [13:17<02:53, 1.55it/s, loss=0.0842, lr=1] Steps: 82%|████████▏ | 1231/1500 [13:17<02:53, 1.55it/s, loss=0.135, lr=1] Steps: 82%|████████▏ | 1232/1500 [13:18<02:52, 1.55it/s, loss=0.135, lr=1] Steps: 82%|████████▏ | 1232/1500 [13:18<02:52, 1.55it/s, loss=0.32, lr=1] Steps: 82%|████████▏ | 1233/1500 [13:18<02:53, 1.54it/s, loss=0.32, lr=1] Steps: 82%|████████▏ | 1233/1500 [13:18<02:53, 1.54it/s, loss=0.161, lr=1] Steps: 82%|████████▏ | 1234/1500 [13:19<02:52, 1.54it/s, loss=0.161, lr=1] Steps: 82%|████████▏ | 1234/1500 [13:19<02:52, 1.54it/s, loss=0.123, lr=1] Steps: 82%|████████▏ | 1235/1500 [13:20<02:51, 1.55it/s, loss=0.123, lr=1] Steps: 82%|████████▏ | 1235/1500 [13:20<02:51, 1.55it/s, loss=0.276, lr=1] Steps: 82%|████████▏ | 1236/1500 [13:20<02:50, 1.55it/s, loss=0.276, lr=1] Steps: 82%|████████▏ | 1236/1500 [13:20<02:50, 1.55it/s, loss=0.184, lr=1] Steps: 82%|████████▏ | 1237/1500 [13:21<02:49, 1.55it/s, loss=0.184, lr=1] Steps: 82%|████████▏ | 1237/1500 [13:21<02:49, 1.55it/s, loss=0.0454, lr=1] Steps: 83%|████████▎ | 1238/1500 [13:22<02:49, 1.55it/s, loss=0.0454, lr=1] Steps: 83%|████████▎ | 1238/1500 [13:22<02:49, 1.55it/s, loss=0.164, lr=1] Steps: 83%|████████▎ | 1239/1500 [13:22<02:48, 1.55it/s, loss=0.164, lr=1] Steps: 83%|████████▎ | 1239/1500 [13:22<02:48, 1.55it/s, loss=0.0809, lr=1] Steps: 83%|████████▎ | 1240/1500 [13:23<02:47, 1.55it/s, loss=0.0809, lr=1] Steps: 83%|████████▎ | 1240/1500 [13:23<02:47, 1.55it/s, loss=0.113, lr=1] Steps: 83%|████████▎ | 1241/1500 [13:24<02:47, 1.55it/s, loss=0.113, lr=1] Steps: 83%|████████▎ | 1241/1500 [13:24<02:47, 1.55it/s, loss=0.0706, lr=1] Steps: 83%|████████▎ | 1242/1500 [13:24<02:47, 1.54it/s, loss=0.0706, lr=1] Steps: 83%|████████▎ | 1242/1500 [13:24<02:47, 1.54it/s, loss=0.0901, lr=1] Steps: 83%|████████▎ | 1243/1500 [13:25<02:46, 1.54it/s, loss=0.0901, lr=1] Steps: 83%|████████▎ | 1243/1500 [13:25<02:46, 
1.54it/s, loss=0.296, lr=1] Steps: 83%|████████▎ | 1244/1500 [13:26<02:45, 1.54it/s, loss=0.296, lr=1] Steps: 83%|████████▎ | 1244/1500 [13:26<02:45, 1.54it/s, loss=0.21, lr=1] Steps: 83%|████████▎ | 1245/1500 [13:26<02:45, 1.54it/s, loss=0.21, lr=1] Steps: 83%|████████▎ | 1245/1500 [13:26<02:45, 1.54it/s, loss=0.135, lr=1] Steps: 83%|████████▎ | 1246/1500 [13:27<02:44, 1.55it/s, loss=0.135, lr=1] Steps: 83%|████████▎ | 1246/1500 [13:27<02:44, 1.55it/s, loss=0.098, lr=1] Steps: 83%|████████▎ | 1247/1500 [13:27<02:43, 1.55it/s, loss=0.098, lr=1] Steps: 83%|████████▎ | 1247/1500 [13:27<02:43, 1.55it/s, loss=0.152, lr=1] Steps: 83%|████████▎ | 1248/1500 [13:28<02:42, 1.55it/s, loss=0.152, lr=1] Steps: 83%|████████▎ | 1248/1500 [13:28<02:42, 1.55it/s, loss=0.194, lr=1] Steps: 83%|████████▎ | 1249/1500 [13:29<02:42, 1.54it/s, loss=0.194, lr=1] Steps: 83%|████████▎ | 1249/1500 [13:29<02:42, 1.54it/s, loss=0.0293, lr=1] Steps: 83%|████████▎ | 1250/1500 [13:29<02:42, 1.54it/s, loss=0.0293, lr=1] Steps: 83%|████████▎ | 1250/1500 [13:29<02:42, 1.54it/s, loss=0.127, lr=1] Steps: 83%|████████▎ | 1251/1500 [13:30<02:41, 1.55it/s, loss=0.127, lr=1] Steps: 83%|████████▎ | 1251/1500 [13:30<02:41, 1.55it/s, loss=0.0657, lr=1] Steps: 83%|████████▎ | 1252/1500 [13:31<02:40, 1.55it/s, loss=0.0657, lr=1] Steps: 83%|████████▎ | 1252/1500 [13:31<02:40, 1.55it/s, loss=0.145, lr=1] Steps: 84%|████████▎ | 1253/1500 [13:31<02:39, 1.55it/s, loss=0.145, lr=1] Steps: 84%|████████▎ | 1253/1500 [13:31<02:39, 1.55it/s, loss=0.0928, lr=1] Steps: 84%|████████▎ | 1254/1500 [13:32<02:38, 1.55it/s, loss=0.0928, lr=1] Steps: 84%|████████▎ | 1254/1500 [13:32<02:38, 1.55it/s, loss=0.0796, lr=1] Steps: 84%|████████▎ | 1255/1500 [13:33<02:38, 1.55it/s, loss=0.0796, lr=1] Steps: 84%|████████▎ | 1255/1500 [13:33<02:38, 1.55it/s, loss=0.073, lr=1] Steps: 84%|████████▎ | 1256/1500 [13:33<02:37, 1.55it/s, loss=0.073, lr=1] Steps: 84%|████████▎ | 1256/1500 [13:33<02:37, 1.55it/s, loss=0.162, lr=1] Steps: 
84%|████████▍ | 1257/1500 [13:34<02:36, 1.55it/s, loss=0.162, lr=1] Steps: 84%|████████▍ | 1257/1500 [13:34<02:36, 1.55it/s, loss=0.123, lr=1] Steps: 84%|████████▍ | 1258/1500 [13:35<02:36, 1.55it/s, loss=0.123, lr=1] Steps: 84%|████████▍ | 1258/1500 [13:35<02:36, 1.55it/s, loss=0.315, lr=1] Steps: 84%|████████▍ | 1259/1500 [13:35<02:35, 1.55it/s, loss=0.315, lr=1] Steps: 84%|████████▍ | 1259/1500 [13:35<02:35, 1.55it/s, loss=0.156, lr=1] Steps: 84%|████████▍ | 1260/1500 [13:36<02:34, 1.55it/s, loss=0.156, lr=1] Steps: 84%|████████▍ | 1260/1500 [13:36<02:34, 1.55it/s, loss=0.186, lr=1] Steps: 84%|████████▍ | 1261/1500 [13:37<02:34, 1.55it/s, loss=0.186, lr=1] Steps: 84%|████████▍ | 1261/1500 [13:37<02:34, 1.55it/s, loss=0.0822, lr=1] Steps: 84%|████████▍ | 1262/1500 [13:37<02:33, 1.55it/s, loss=0.0822, lr=1] Steps: 84%|████████▍ | 1262/1500 [13:37<02:33, 1.55it/s, loss=0.2, lr=1] Steps: 84%|████████▍ | 1263/1500 [13:38<02:32, 1.55it/s, loss=0.2, lr=1] Steps: 84%|████████▍ | 1263/1500 [13:38<02:32, 1.55it/s, loss=0.171, lr=1] Steps: 84%|████████▍ | 1264/1500 [13:38<02:32, 1.55it/s, loss=0.171, lr=1] Steps: 84%|████████▍ | 1264/1500 [13:38<02:32, 1.55it/s, loss=0.231, lr=1] Steps: 84%|████████▍ | 1265/1500 [13:39<02:32, 1.54it/s, loss=0.231, lr=1] Steps: 84%|████████▍ | 1265/1500 [13:39<02:32, 1.54it/s, loss=0.183, lr=1] Steps: 84%|████████▍ | 1266/1500 [13:40<02:31, 1.54it/s, loss=0.183, lr=1] Steps: 84%|████████▍ | 1266/1500 [13:40<02:31, 1.54it/s, loss=0.137, lr=1] Steps: 84%|████████▍ | 1267/1500 [13:40<02:30, 1.54it/s, loss=0.137, lr=1] Steps: 84%|████████▍ | 1267/1500 [13:40<02:30, 1.54it/s, loss=0.119, lr=1] Steps: 85%|████████▍ | 1268/1500 [13:41<02:30, 1.54it/s, loss=0.119, lr=1] Steps: 85%|████████▍ | 1268/1500 [13:41<02:30, 1.54it/s, loss=0.0888, lr=1] Steps: 85%|████████▍ | 1269/1500 [13:42<02:29, 1.54it/s, loss=0.0888, lr=1] Steps: 85%|████████▍ | 1269/1500 [13:42<02:29, 1.54it/s, loss=0.063, lr=1] Steps: 85%|████████▍ | 1270/1500 [13:42<02:28, 1.55it/s, 
loss=0.063, lr=1] Steps: 85%|████████▍ | 1270/1500 [13:42<02:28, 1.55it/s, loss=0.173, lr=1] Steps: 85%|████████▍ | 1271/1500 [13:43<02:27, 1.55it/s, loss=0.173, lr=1] Steps: 85%|████████▍ | 1271/1500 [13:43<02:27, 1.55it/s, loss=0.0739, lr=1] Steps: 85%|████████▍ | 1272/1500 [13:44<02:27, 1.55it/s, loss=0.0739, lr=1] Steps: 85%|████████▍ | 1272/1500 [13:44<02:27, 1.55it/s, loss=0.161, lr=1] Steps: 85%|████████▍ | 1273/1500 [13:44<02:26, 1.55it/s, loss=0.161, lr=1] Steps: 85%|████████▍ | 1273/1500 [13:44<02:26, 1.55it/s, loss=0.138, lr=1] Steps: 85%|████████▍ | 1274/1500 [13:45<02:26, 1.55it/s, loss=0.138, lr=1] Steps: 85%|████████▍ | 1274/1500 [13:45<02:26, 1.55it/s, loss=0.166, lr=1] Steps: 85%|████████▌ | 1275/1500 [13:46<02:25, 1.55it/s, loss=0.166, lr=1] Steps: 85%|████████▌ | 1275/1500 [13:46<02:25, 1.55it/s, loss=0.103, lr=1] Steps: 85%|████████▌ | 1276/1500 [13:46<02:24, 1.55it/s, loss=0.103, lr=1] Steps: 85%|████████▌ | 1276/1500 [13:46<02:24, 1.55it/s, loss=0.193, lr=1] Steps: 85%|████████▌ | 1277/1500 [13:47<02:23, 1.55it/s, loss=0.193, lr=1] Steps: 85%|████████▌ | 1277/1500 [13:47<02:23, 1.55it/s, loss=0.382, lr=1] Steps: 85%|████████▌ | 1278/1500 [13:48<02:23, 1.55it/s, loss=0.382, lr=1] Steps: 85%|████████▌ | 1278/1500 [13:48<02:23, 1.55it/s, loss=0.211, lr=1] Steps: 85%|████████▌ | 1279/1500 [13:48<02:22, 1.55it/s, loss=0.211, lr=1] Steps: 85%|████████▌ | 1279/1500 [13:48<02:22, 1.55it/s, loss=0.0598, lr=1] Steps: 85%|████████▌ | 1280/1500 [13:49<02:21, 1.55it/s, loss=0.0598, lr=1] Steps: 85%|████████▌ | 1280/1500 [13:49<02:21, 1.55it/s, loss=0.158, lr=1] Steps: 85%|████████▌ | 1281/1500 [13:49<02:22, 1.54it/s, loss=0.158, lr=1] Steps: 85%|████████▌ | 1281/1500 [13:49<02:22, 1.54it/s, loss=0.139, lr=1] Steps: 85%|████████▌ | 1282/1500 [13:50<02:21, 1.54it/s, loss=0.139, lr=1] Steps: 85%|████████▌ | 1282/1500 [13:50<02:21, 1.54it/s, loss=0.189, lr=1] Steps: 86%|████████▌ | 1283/1500 [13:51<02:20, 1.54it/s, loss=0.189, lr=1] Steps: 86%|████████▌ | 
1283/1500 [13:51<02:20, 1.54it/s, loss=0.108, lr=1] Steps: 86%|████████▌ | 1284/1500 [13:51<02:19, 1.55it/s, loss=0.108, lr=1] Steps: 86%|████████▌ | 1284/1500 [13:51<02:19, 1.55it/s, loss=0.0843, lr=1] Steps: 86%|████████▌ | 1285/1500 [13:52<02:18, 1.55it/s, loss=0.0843, lr=1] Steps: 86%|████████▌ | 1285/1500 [13:52<02:18, 1.55it/s, loss=0.0307, lr=1] Steps: 86%|████████▌ | 1286/1500 [13:53<02:18, 1.55it/s, loss=0.0307, lr=1] Steps: 86%|████████▌ | 1286/1500 [13:53<02:18, 1.55it/s, loss=0.181, lr=1] Steps: 86%|████████▌ | 1287/1500 [13:53<02:17, 1.55it/s, loss=0.181, lr=1] Steps: 86%|████████▌ | 1287/1500 [13:53<02:17, 1.55it/s, loss=0.126, lr=1] Steps: 86%|████████▌ | 1288/1500 [13:54<02:17, 1.55it/s, loss=0.126, lr=1] Steps: 86%|████████▌ | 1288/1500 [13:54<02:17, 1.55it/s, loss=0.0757, lr=1] Steps: 86%|████████▌ | 1289/1500 [13:55<02:16, 1.55it/s, loss=0.0757, lr=1] Steps: 86%|████████▌ | 1289/1500 [13:55<02:16, 1.55it/s, loss=0.449, lr=1] Steps: 86%|████████▌ | 1290/1500 [13:55<02:15, 1.55it/s, loss=0.449, lr=1] Steps: 86%|████████▌ | 1290/1500 [13:55<02:15, 1.55it/s, loss=0.16, lr=1] Steps: 86%|████████▌ | 1291/1500 [13:56<02:15, 1.55it/s, loss=0.16, lr=1] Steps: 86%|████████▌ | 1291/1500 [13:56<02:15, 1.55it/s, loss=0.23, lr=1] Steps: 86%|████████▌ | 1292/1500 [13:57<02:14, 1.55it/s, loss=0.23, lr=1] Steps: 86%|████████▌ | 1292/1500 [13:57<02:14, 1.55it/s, loss=0.0799, lr=1] Steps: 86%|████████▌ | 1293/1500 [13:57<02:13, 1.55it/s, loss=0.0799, lr=1] Steps: 86%|████████▌ | 1293/1500 [13:57<02:13, 1.55it/s, loss=0.152, lr=1] Steps: 86%|████████▋ | 1294/1500 [13:58<02:13, 1.55it/s, loss=0.152, lr=1] Steps: 86%|████████▋ | 1294/1500 [13:58<02:13, 1.55it/s, loss=0.0529, lr=1] Steps: 86%|████████▋ | 1295/1500 [13:59<02:12, 1.55it/s, loss=0.0529, lr=1] Steps: 86%|████████▋ | 1295/1500 [13:59<02:12, 1.55it/s, loss=0.154, lr=1] Steps: 86%|████████▋ | 1296/1500 [13:59<02:11, 1.55it/s, loss=0.154, lr=1] Steps: 86%|████████▋ | 1296/1500 [13:59<02:11, 1.55it/s, 
loss=0.12, lr=1] Steps: 86%|████████▋ | 1297/1500 [14:00<02:11, 1.54it/s, loss=0.12, lr=1] Steps: 86%|████████▋ | 1297/1500 [14:00<02:11, 1.54it/s, loss=0.105, lr=1] Steps: 87%|████████▋ | 1298/1500 [14:00<02:10, 1.54it/s, loss=0.105, lr=1] Steps: 87%|████████▋ | 1298/1500 [14:00<02:10, 1.54it/s, loss=0.0355, lr=1] Steps: 87%|████████▋ | 1299/1500 [14:01<02:09, 1.55it/s, loss=0.0355, lr=1] Steps: 87%|████████▋ | 1299/1500 [14:01<02:09, 1.55it/s, loss=0.15, lr=1] Steps: 87%|████████▋ | 1300/1500 [14:02<02:09, 1.55it/s, loss=0.15, lr=1] Steps: 87%|████████▋ | 1300/1500 [14:02<02:09, 1.55it/s, loss=0.148, lr=1] Steps: 87%|████████▋ | 1301/1500 [14:02<02:08, 1.55it/s, loss=0.148, lr=1] Steps: 87%|████████▋ | 1301/1500 [14:02<02:08, 1.55it/s, loss=0.186, lr=1] Steps: 87%|████████▋ | 1302/1500 [14:03<02:07, 1.55it/s, loss=0.186, lr=1] Steps: 87%|████████▋ | 1302/1500 [14:03<02:07, 1.55it/s, loss=0.0771, lr=1] Steps: 87%|████████▋ | 1303/1500 [14:04<02:07, 1.55it/s, loss=0.0771, lr=1] Steps: 87%|████████▋ | 1303/1500 [14:04<02:07, 1.55it/s, loss=0.161, lr=1] Steps: 87%|████████▋ | 1304/1500 [14:04<02:06, 1.55it/s, loss=0.161, lr=1] Steps: 87%|████████▋ | 1304/1500 [14:04<02:06, 1.55it/s, loss=0.104, lr=1] Steps: 87%|████████▋ | 1305/1500 [14:05<02:05, 1.55it/s, loss=0.104, lr=1] Steps: 87%|████████▋ | 1305/1500 [14:05<02:05, 1.55it/s, loss=0.112, lr=1] Steps: 87%|████████▋ | 1306/1500 [14:06<02:05, 1.55it/s, loss=0.112, lr=1] Steps: 87%|████████▋ | 1306/1500 [14:06<02:05, 1.55it/s, loss=0.25, lr=1] Steps: 87%|████████▋ | 1307/1500 [14:06<02:04, 1.55it/s, loss=0.25, lr=1] Steps: 87%|████████▋ | 1307/1500 [14:06<02:04, 1.55it/s, loss=0.0566, lr=1] Steps: 87%|████████▋ | 1308/1500 [14:07<02:04, 1.55it/s, loss=0.0566, lr=1] Steps: 87%|████████▋ | 1308/1500 [14:07<02:04, 1.55it/s, loss=0.164, lr=1] Steps: 87%|████████▋ | 1309/1500 [14:08<02:03, 1.55it/s, loss=0.164, lr=1] Steps: 87%|████████▋ | 1309/1500 [14:08<02:03, 1.55it/s, loss=0.165, lr=1] Steps: 87%|████████▋ | 
1310/1500 [14:08<02:02, 1.55it/s, loss=0.165, lr=1] Steps: 87%|████████▋ | 1310/1500 [14:08<02:02, 1.55it/s, loss=0.15, lr=1] Steps: 87%|████████▋ | 1311/1500 [14:09<02:02, 1.55it/s, loss=0.15, lr=1] Steps: 87%|████████▋ | 1311/1500 [14:09<02:02, 1.55it/s, loss=0.136, lr=1] Steps: 87%|████████▋ | 1312/1500 [14:10<02:01, 1.55it/s, loss=0.136, lr=1] Steps: 87%|████████▋ | 1312/1500 [14:10<02:01, 1.55it/s, loss=0.104, lr=1] Steps: 88%|████████▊ | 1313/1500 [14:10<02:01, 1.54it/s, loss=0.104, lr=1] Steps: 88%|████████▊ | 1313/1500 [14:10<02:01, 1.54it/s, loss=0.103, lr=1] Steps: 88%|████████▊ | 1314/1500 [14:11<02:00, 1.54it/s, loss=0.103, lr=1] Steps: 88%|████████▊ | 1314/1500 [14:11<02:00, 1.54it/s, loss=0.0435, lr=1] Steps: 88%|████████▊ | 1315/1500 [14:11<01:59, 1.54it/s, loss=0.0435, lr=1] Steps: 88%|████████▊ | 1315/1500 [14:11<01:59, 1.54it/s, loss=0.133, lr=1] Steps: 88%|████████▊ | 1316/1500 [14:12<01:59, 1.55it/s, loss=0.133, lr=1] Steps: 88%|████████▊ | 1316/1500 [14:12<01:59, 1.55it/s, loss=0.177, lr=1] Steps: 88%|████████▊ | 1317/1500 [14:13<01:58, 1.55it/s, loss=0.177, lr=1] Steps: 88%|████████▊ | 1317/1500 [14:13<01:58, 1.55it/s, loss=0.112, lr=1] Steps: 88%|████████▊ | 1318/1500 [14:13<01:57, 1.55it/s, loss=0.112, lr=1] Steps: 88%|████████▊ | 1318/1500 [14:13<01:57, 1.55it/s, loss=0.113, lr=1] Steps: 88%|████████▊ | 1319/1500 [14:14<01:56, 1.55it/s, loss=0.113, lr=1] Steps: 88%|████████▊ | 1319/1500 [14:14<01:56, 1.55it/s, loss=0.0744, lr=1] Steps: 88%|████████▊ | 1320/1500 [14:15<01:56, 1.55it/s, loss=0.0744, lr=1] Steps: 88%|████████▊ | 1320/1500 [14:15<01:56, 1.55it/s, loss=0.0429, lr=1] Steps: 88%|████████▊ | 1321/1500 [14:15<01:55, 1.55it/s, loss=0.0429, lr=1] Steps: 88%|████████▊ | 1321/1500 [14:15<01:55, 1.55it/s, loss=0.108, lr=1] Steps: 88%|████████▊ | 1322/1500 [14:16<01:54, 1.55it/s, loss=0.108, lr=1] Steps: 88%|████████▊ | 1322/1500 [14:16<01:54, 1.55it/s, loss=0.11, lr=1] Steps: 88%|████████▊ | 1323/1500 [14:17<01:54, 1.55it/s, loss=0.11, 
lr=1] Steps: 88%|████████▊ | 1323/1500 [14:17<01:54, 1.55it/s, loss=0.107, lr=1] Steps: 88%|████████▊ | 1324/1500 [14:17<01:53, 1.55it/s, loss=0.107, lr=1] Steps: 88%|████████▊ | 1324/1500 [14:17<01:53, 1.55it/s, loss=0.113, lr=1] Steps: 88%|████████▊ | 1325/1500 [14:18<01:52, 1.55it/s, loss=0.113, lr=1] Steps: 88%|████████▊ | 1325/1500 [14:18<01:52, 1.55it/s, loss=0.0766, lr=1] Steps: 88%|████████▊ | 1326/1500 [14:19<01:52, 1.55it/s, loss=0.0766, lr=1] Steps: 88%|████████▊ | 1326/1500 [14:19<01:52, 1.55it/s, loss=0.0635, lr=1] Steps: 88%|████████▊ | 1327/1500 [14:19<01:51, 1.55it/s, loss=0.0635, lr=1] Steps: 88%|████████▊ | 1327/1500 [14:19<01:51, 1.55it/s, loss=0.117, lr=1] Steps: 89%|████████▊ | 1328/1500 [14:20<01:50, 1.55it/s, loss=0.117, lr=1] Steps: 89%|████████▊ | 1328/1500 [14:20<01:50, 1.55it/s, loss=0.149, lr=1] Steps: 89%|████████▊ | 1329/1500 [14:21<01:50, 1.54it/s, loss=0.149, lr=1] Steps: 89%|████████▊ | 1329/1500 [14:21<01:50, 1.54it/s, loss=0.307, lr=1] Steps: 89%|████████▊ | 1330/1500 [14:21<01:49, 1.55it/s, loss=0.307, lr=1] Steps: 89%|████████▊ | 1330/1500 [14:21<01:49, 1.55it/s, loss=0.189, lr=1] Steps: 89%|████████▊ | 1331/1500 [14:22<01:49, 1.55it/s, loss=0.189, lr=1] Steps: 89%|████████▊ | 1331/1500 [14:22<01:49, 1.55it/s, loss=0.0944, lr=1] Steps: 89%|████████▉ | 1332/1500 [14:22<01:48, 1.55it/s, loss=0.0944, lr=1] Steps: 89%|████████▉ | 1332/1500 [14:22<01:48, 1.55it/s, loss=0.122, lr=1] Steps: 89%|████████▉ | 1333/1500 [14:23<01:47, 1.55it/s, loss=0.122, lr=1] Steps: 89%|████████▉ | 1333/1500 [14:23<01:47, 1.55it/s, loss=0.0858, lr=1] Steps: 89%|████████▉ | 1334/1500 [14:24<01:47, 1.55it/s, loss=0.0858, lr=1] Steps: 89%|████████▉ | 1334/1500 [14:24<01:47, 1.55it/s, loss=0.127, lr=1] Steps: 89%|████████▉ | 1335/1500 [14:24<01:46, 1.55it/s, loss=0.127, lr=1] Steps: 89%|████████▉ | 1335/1500 [14:24<01:46, 1.55it/s, loss=0.134, lr=1] Steps: 89%|████████▉ | 1336/1500 [14:25<01:45, 1.55it/s, loss=0.134, lr=1] Steps: 89%|████████▉ | 1336/1500 
[14:25<01:45, 1.55it/s, loss=0.162, lr=1] Steps: 89%|████████▉ | 1337/1500 [14:26<01:45, 1.55it/s, loss=0.162, lr=1] Steps: 89%|████████▉ | 1337/1500 [14:26<01:45, 1.55it/s, loss=0.037, lr=1] Steps: 89%|████████▉ | 1338/1500 [14:26<01:44, 1.55it/s, loss=0.037, lr=1] Steps: 89%|████████▉ | 1338/1500 [14:26<01:44, 1.55it/s, loss=0.0404, lr=1] Steps: 89%|████████▉ | 1339/1500 [14:27<01:44, 1.55it/s, loss=0.0404, lr=1] Steps: 89%|████████▉ | 1339/1500 [14:27<01:44, 1.55it/s, loss=0.127, lr=1] Steps: 89%|████████▉ | 1340/1500 [14:28<01:43, 1.55it/s, loss=0.127, lr=1] Steps: 89%|████████▉ | 1340/1500 [14:28<01:43, 1.55it/s, loss=0.16, lr=1] Steps: 89%|████████▉ | 1341/1500 [14:28<01:42, 1.55it/s, loss=0.16, lr=1] Steps: 89%|████████▉ | 1341/1500 [14:28<01:42, 1.55it/s, loss=0.265, lr=1] Steps: 89%|████████▉ | 1342/1500 [14:29<01:42, 1.55it/s, loss=0.265, lr=1] Steps: 89%|████████▉ | 1342/1500 [14:29<01:42, 1.55it/s, loss=0.0393, lr=1] Steps: 90%|████████▉ | 1343/1500 [14:30<01:41, 1.55it/s, loss=0.0393, lr=1] Steps: 90%|████████▉ | 1343/1500 [14:30<01:41, 1.55it/s, loss=0.0751, lr=1] Steps: 90%|████████▉ | 1344/1500 [14:30<01:40, 1.55it/s, loss=0.0751, lr=1] Steps: 90%|████████▉ | 1344/1500 [14:30<01:40, 1.55it/s, loss=0.0515, lr=1] Steps: 90%|████████▉ | 1345/1500 [14:31<01:46, 1.46it/s, loss=0.0515, lr=1] Steps: 90%|████████▉ | 1345/1500 [14:31<01:46, 1.46it/s, loss=0.115, lr=1] Steps: 90%|████████▉ | 1346/1500 [14:32<01:43, 1.48it/s, loss=0.115, lr=1] Steps: 90%|████████▉ | 1346/1500 [14:32<01:43, 1.48it/s, loss=0.114, lr=1] Steps: 90%|████████▉ | 1347/1500 [14:32<01:41, 1.50it/s, loss=0.114, lr=1] Steps: 90%|████████▉ | 1347/1500 [14:32<01:41, 1.50it/s, loss=0.0997, lr=1] Steps: 90%|████████▉ | 1348/1500 [14:33<01:40, 1.52it/s, loss=0.0997, lr=1] Steps: 90%|████████▉ | 1348/1500 [14:33<01:40, 1.52it/s, loss=0.105, lr=1] Steps: 90%|████████▉ | 1349/1500 [14:34<01:38, 1.53it/s, loss=0.105, lr=1] Steps: 90%|████████▉ | 1349/1500 [14:34<01:38, 1.53it/s, loss=0.125, lr=1] 
Steps: 90%|█████████ | 1350/1500 [14:34<01:37, 1.54it/s, loss=0.125, lr=1] Steps: 90%|█████████ | 1350/1500 [14:34<01:37, 1.54it/s, loss=0.126, lr=1] Steps: 90%|█████████ | 1351/1500 [14:35<01:36, 1.54it/s, loss=0.126, lr=1] Steps: 90%|█████████ | 1351/1500 [14:35<01:36, 1.54it/s, loss=0.0936, lr=1] Steps: 90%|█████████ | 1352/1500 [14:35<01:35, 1.54it/s, loss=0.0936, lr=1] Steps: 90%|█████████ | 1352/1500 [14:35<01:35, 1.54it/s, loss=0.242, lr=1] Steps: 90%|█████████ | 1353/1500 [14:36<01:35, 1.55it/s, loss=0.242, lr=1] Steps: 90%|█████████ | 1353/1500 [14:36<01:35, 1.55it/s, loss=0.117, lr=1] Steps: 90%|█████████ | 1354/1500 [14:37<01:34, 1.55it/s, loss=0.117, lr=1] Steps: 90%|█████████ | 1354/1500 [14:37<01:34, 1.55it/s, loss=0.173, lr=1] Steps: 90%|█████████ | 1355/1500 [14:37<01:33, 1.55it/s, loss=0.173, lr=1] Steps: 90%|█████████ | 1355/1500 [14:37<01:33, 1.55it/s, loss=0.149, lr=1] Steps: 90%|█████████ | 1356/1500 [14:38<01:32, 1.55it/s, loss=0.149, lr=1] Steps: 90%|█████████ | 1356/1500 [14:38<01:32, 1.55it/s, loss=0.0857, lr=1] Steps: 90%|█████████ | 1357/1500 [14:39<01:32, 1.55it/s, loss=0.0857, lr=1] Steps: 90%|█████████ | 1357/1500 [14:39<01:32, 1.55it/s, loss=0.0996, lr=1] Steps: 91%|█████████ | 1358/1500 [14:39<01:31, 1.55it/s, loss=0.0996, lr=1] Steps: 91%|█████████ | 1358/1500 [14:39<01:31, 1.55it/s, loss=0.147, lr=1] Steps: 91%|█████████ | 1359/1500 [14:40<01:30, 1.55it/s, loss=0.147, lr=1] Steps: 91%|█████████ | 1359/1500 [14:40<01:30, 1.55it/s, loss=0.17, lr=1] Steps: 91%|█████████ | 1360/1500 [14:41<01:30, 1.55it/s, loss=0.17, lr=1] Steps: 91%|█████████ | 1360/1500 [14:41<01:30, 1.55it/s, loss=0.113, lr=1] Steps: 91%|█████████ | 1361/1500 [14:41<01:30, 1.54it/s, loss=0.113, lr=1] Steps: 91%|█████████ | 1361/1500 [14:41<01:30, 1.54it/s, loss=0.0778, lr=1] Steps: 91%|█████████ | 1362/1500 [14:42<01:29, 1.54it/s, loss=0.0778, lr=1] Steps: 91%|█████████ | 1362/1500 [14:42<01:29, 1.54it/s, loss=0.0827, lr=1] Steps: 91%|█████████ | 1363/1500 
[14:43<01:28, 1.55it/s, loss=0.0827, lr=1] Steps: 91%|█████████ | 1363/1500 [14:43<01:28, 1.55it/s, loss=0.0688, lr=1] Steps: 91%|█████████ | 1364/1500 [14:43<01:27, 1.55it/s, loss=0.0688, lr=1] Steps: 91%|█████████ | 1364/1500 [14:43<01:27, 1.55it/s, loss=0.101, lr=1] Steps: 91%|█████████ | 1365/1500 [14:44<01:27, 1.55it/s, loss=0.101, lr=1] Steps: 91%|█████████ | 1365/1500 [14:44<01:27, 1.55it/s, loss=0.147, lr=1] Steps: 91%|█████████ | 1366/1500 [14:45<01:26, 1.55it/s, loss=0.147, lr=1] Steps: 91%|█████████ | 1366/1500 [14:45<01:26, 1.55it/s, loss=0.0731, lr=1] Steps: 91%|█████████ | 1367/1500 [14:45<01:25, 1.55it/s, loss=0.0731, lr=1] Steps: 91%|█████████ | 1367/1500 [14:45<01:25, 1.55it/s, loss=0.164, lr=1] Steps: 91%|█████████ | 1368/1500 [14:46<01:25, 1.55it/s, loss=0.164, lr=1] Steps: 91%|█████████ | 1368/1500 [14:46<01:25, 1.55it/s, loss=0.0986, lr=1] Steps: 91%|█████████▏| 1369/1500 [14:46<01:24, 1.55it/s, loss=0.0986, lr=1] Steps: 91%|█████████▏| 1369/1500 [14:46<01:24, 1.55it/s, loss=0.0539, lr=1] Steps: 91%|█████████▏| 1370/1500 [14:47<01:23, 1.55it/s, loss=0.0539, lr=1] Steps: 91%|█████████▏| 1370/1500 [14:47<01:23, 1.55it/s, loss=0.106, lr=1] Steps: 91%|█████████▏| 1371/1500 [14:48<01:23, 1.55it/s, loss=0.106, lr=1] Steps: 91%|█████████▏| 1371/1500 [14:48<01:23, 1.55it/s, loss=0.069, lr=1] Steps: 91%|█████████▏| 1372/1500 [14:48<01:22, 1.55it/s, loss=0.069, lr=1] Steps: 91%|█████████▏| 1372/1500 [14:48<01:22, 1.55it/s, loss=0.0801, lr=1] Steps: 92%|█████████▏| 1373/1500 [14:49<01:21, 1.55it/s, loss=0.0801, lr=1] Steps: 92%|█████████▏| 1373/1500 [14:49<01:21, 1.55it/s, loss=0.258, lr=1] Steps: 92%|█████████▏| 1374/1500 [14:50<01:21, 1.55it/s, loss=0.258, lr=1] Steps: 92%|█████████▏| 1374/1500 [14:50<01:21, 1.55it/s, loss=0.0993, lr=1] Steps: 92%|█████████▏| 1375/1500 [14:50<01:20, 1.55it/s, loss=0.0993, lr=1] Steps: 92%|█████████▏| 1375/1500 [14:50<01:20, 1.55it/s, loss=0.114, lr=1] Steps: 92%|█████████▏| 1376/1500 [14:51<01:19, 1.55it/s, loss=0.114, 
lr=1] Steps: 92%|█████████▏| 1376/1500 [14:51<01:19, 1.55it/s, loss=0.15, lr=1] Steps: 92%|█████████▏| 1377/1500 [14:52<01:19, 1.54it/s, loss=0.15, lr=1] Steps: 92%|█████████▏| 1377/1500 [14:52<01:19, 1.54it/s, loss=0.149, lr=1] Steps: 92%|█████████▏| 1378/1500 [14:52<01:18, 1.55it/s, loss=0.149, lr=1] Steps: 92%|█████████▏| 1378/1500 [14:52<01:18, 1.55it/s, loss=0.106, lr=1] Steps: 92%|█████████▏| 1379/1500 [14:53<01:18, 1.55it/s, loss=0.106, lr=1] Steps: 92%|█████████▏| 1379/1500 [14:53<01:18, 1.55it/s, loss=0.0788, lr=1] Steps: 92%|█████████▏| 1380/1500 [14:54<01:17, 1.55it/s, loss=0.0788, lr=1] Steps: 92%|█████████▏| 1380/1500 [14:54<01:17, 1.55it/s, loss=0.157, lr=1] Steps: 92%|█████████▏| 1381/1500 [14:54<01:16, 1.55it/s, loss=0.157, lr=1] Steps: 92%|█████████▏| 1381/1500 [14:54<01:16, 1.55it/s, loss=0.101, lr=1] Steps: 92%|█████████▏| 1382/1500 [14:55<01:16, 1.55it/s, loss=0.101, lr=1] Steps: 92%|█████████▏| 1382/1500 [14:55<01:16, 1.55it/s, loss=0.0446, lr=1] Steps: 92%|█████████▏| 1383/1500 [14:55<01:15, 1.55it/s, loss=0.0446, lr=1] Steps: 92%|█████████▏| 1383/1500 [14:55<01:15, 1.55it/s, loss=0.107, lr=1] Steps: 92%|█████████▏| 1384/1500 [14:56<01:14, 1.55it/s, loss=0.107, lr=1] Steps: 92%|█████████▏| 1384/1500 [14:56<01:14, 1.55it/s, loss=0.144, lr=1] Steps: 92%|█████████▏| 1385/1500 [14:57<01:14, 1.55it/s, loss=0.144, lr=1] Steps: 92%|█████████▏| 1385/1500 [14:57<01:14, 1.55it/s, loss=0.0751, lr=1] Steps: 92%|█████████▏| 1386/1500 [14:57<01:13, 1.55it/s, loss=0.0751, lr=1] Steps: 92%|█████████▏| 1386/1500 [14:57<01:13, 1.55it/s, loss=0.0852, lr=1] Steps: 92%|█████████▏| 1387/1500 [14:58<01:12, 1.55it/s, loss=0.0852, lr=1] Steps: 92%|█████████▏| 1387/1500 [14:58<01:12, 1.55it/s, loss=0.119, lr=1] Steps: 93%|█████████▎| 1388/1500 [14:59<01:12, 1.55it/s, loss=0.119, lr=1] Steps: 93%|█████████▎| 1388/1500 [14:59<01:12, 1.55it/s, loss=0.205, lr=1] Steps: 93%|█████████▎| 1389/1500 [14:59<01:11, 1.55it/s, loss=0.205, lr=1] Steps: 93%|█████████▎| 1389/1500 
[14:59<01:11, 1.55it/s, loss=0.112, lr=1] Steps: 93%|█████████▎| 1390/1500 [15:00<01:10, 1.55it/s, loss=0.112, lr=1] Steps: 93%|█████████▎| 1390/1500 [15:00<01:10, 1.55it/s, loss=0.203, lr=1] Steps: 93%|█████████▎| 1391/1500 [15:01<01:10, 1.55it/s, loss=0.203, lr=1] Steps: 93%|█████████▎| 1391/1500 [15:01<01:10, 1.55it/s, loss=0.157, lr=1] Steps: 93%|█████████▎| 1392/1500 [15:01<01:09, 1.55it/s, loss=0.157, lr=1] Steps: 93%|█████████▎| 1392/1500 [15:01<01:09, 1.55it/s, loss=0.102, lr=1] Steps: 93%|█████████▎| 1393/1500 [15:02<01:09, 1.54it/s, loss=0.102, lr=1] Steps: 93%|█████████▎| 1393/1500 [15:02<01:09, 1.54it/s, loss=0.0599, lr=1] Steps: 93%|█████████▎| 1394/1500 [15:03<01:08, 1.55it/s, loss=0.0599, lr=1] Steps: 93%|█████████▎| 1394/1500 [15:03<01:08, 1.55it/s, loss=0.0734, lr=1] Steps: 93%|█████████▎| 1395/1500 [15:03<01:07, 1.55it/s, loss=0.0734, lr=1] Steps: 93%|█████████▎| 1395/1500 [15:03<01:07, 1.55it/s, loss=0.151, lr=1] Steps: 93%|█████████▎| 1396/1500 [15:04<01:07, 1.54it/s, loss=0.151, lr=1] Steps: 93%|█████████▎| 1396/1500 [15:04<01:07, 1.54it/s, loss=0.17, lr=1] Steps: 93%|█████████▎| 1397/1500 [15:05<01:06, 1.55it/s, loss=0.17, lr=1] Steps: 93%|█████████▎| 1397/1500 [15:05<01:06, 1.55it/s, loss=0.0297, lr=1] Steps: 93%|█████████▎| 1398/1500 [15:05<01:05, 1.55it/s, loss=0.0297, lr=1] Steps: 93%|█████████▎| 1398/1500 [15:05<01:05, 1.55it/s, loss=0.125, lr=1] Steps: 93%|█████████▎| 1399/1500 [15:06<01:05, 1.55it/s, loss=0.125, lr=1] Steps: 93%|█████████▎| 1399/1500 [15:06<01:05, 1.55it/s, loss=0.151, lr=1] Steps: 93%|█████████▎| 1400/1500 [15:06<01:04, 1.55it/s, loss=0.151, lr=1] Steps: 93%|█████████▎| 1400/1500 [15:06<01:04, 1.55it/s, loss=0.06, lr=1] Steps: 93%|█████████▎| 1401/1500 [15:07<01:03, 1.55it/s, loss=0.06, lr=1] Steps: 93%|█████████▎| 1401/1500 [15:07<01:03, 1.55it/s, loss=0.074, lr=1] Steps: 93%|█████████▎| 1402/1500 [15:08<01:03, 1.55it/s, loss=0.074, lr=1] Steps: 93%|█████████▎| 1402/1500 [15:08<01:03, 1.55it/s, loss=0.181, lr=1] 
Steps: 94%|█████████▎| 1403/1500 [15:08<01:02, 1.55it/s, loss=0.181, lr=1] Steps: 94%|█████████▎| 1403/1500 [15:08<01:02, 1.55it/s, loss=0.121, lr=1] Steps: 94%|█████████▎| 1404/1500 [15:09<01:01, 1.55it/s, loss=0.121, lr=1] Steps: 94%|█████████▎| 1404/1500 [15:09<01:01, 1.55it/s, loss=0.106, lr=1] Steps: 94%|█████████▎| 1405/1500 [15:10<01:01, 1.55it/s, loss=0.106, lr=1] Steps: 94%|█████████▎| 1405/1500 [15:10<01:01, 1.55it/s, loss=0.147, lr=1] Steps: 94%|█████████▎| 1406/1500 [15:10<01:00, 1.55it/s, loss=0.147, lr=1] Steps: 94%|█████████▎| 1406/1500 [15:10<01:00, 1.55it/s, loss=0.19, lr=1] Steps: 94%|█████████▍| 1407/1500 [15:11<00:59, 1.55it/s, loss=0.19, lr=1] Steps: 94%|█████████▍| 1407/1500 [15:11<00:59, 1.55it/s, loss=0.0706, lr=1] Steps: 94%|█████████▍| 1408/1500 [15:12<00:59, 1.55it/s, loss=0.0706, lr=1] Steps: 94%|█████████▍| 1408/1500 [15:12<00:59, 1.55it/s, loss=0.193, lr=1] Steps: 94%|█████████▍| 1409/1500 [15:12<00:59, 1.54it/s, loss=0.193, lr=1] Steps: 94%|█████████▍| 1409/1500 [15:12<00:59, 1.54it/s, loss=0.204, lr=1] Steps: 94%|█████████▍| 1410/1500 [15:13<00:58, 1.54it/s, loss=0.204, lr=1] Steps: 94%|█████████▍| 1410/1500 [15:13<00:58, 1.54it/s, loss=0.0766, lr=1] Steps: 94%|█████████▍| 1411/1500 [15:14<00:57, 1.55it/s, loss=0.0766, lr=1] Steps: 94%|█████████▍| 1411/1500 [15:14<00:57, 1.55it/s, loss=0.113, lr=1] Steps: 94%|█████████▍| 1412/1500 [15:14<00:56, 1.55it/s, loss=0.113, lr=1] Steps: 94%|█████████▍| 1412/1500 [15:14<00:56, 1.55it/s, loss=0.032, lr=1] Steps: 94%|█████████▍| 1413/1500 [15:15<00:56, 1.55it/s, loss=0.032, lr=1] Steps: 94%|█████████▍| 1413/1500 [15:15<00:56, 1.55it/s, loss=0.104, lr=1] Steps: 94%|█████████▍| 1414/1500 [15:15<00:55, 1.55it/s, loss=0.104, lr=1] Steps: 94%|█████████▍| 1414/1500 [15:15<00:55, 1.55it/s, loss=0.109, lr=1] Steps: 94%|█████████▍| 1415/1500 [15:16<00:54, 1.55it/s, loss=0.109, lr=1] Steps: 94%|█████████▍| 1415/1500 [15:16<00:54, 1.55it/s, loss=0.187, lr=1] Steps: 94%|█████████▍| 1416/1500 [15:17<00:54, 
1.55it/s, loss=0.187, lr=1] Steps: 94%|█████████▍| 1416/1500 [15:17<00:54, 1.55it/s, loss=0.128, lr=1] Steps: 94%|█████████▍| 1417/1500 [15:17<00:53, 1.55it/s, loss=0.128, lr=1] Steps: 94%|█████████▍| 1417/1500 [15:17<00:53, 1.55it/s, loss=0.169, lr=1] Steps: 95%|█████████▍| 1418/1500 [15:18<00:52, 1.55it/s, loss=0.169, lr=1] Steps: 95%|█████████▍| 1418/1500 [15:18<00:52, 1.55it/s, loss=0.127, lr=1] Steps: 95%|█████████▍| 1419/1500 [15:19<00:52, 1.55it/s, loss=0.127, lr=1] Steps: 95%|█████████▍| 1419/1500 [15:19<00:52, 1.55it/s, loss=0.0858, lr=1] Steps: 95%|█████████▍| 1420/1500 [15:19<00:51, 1.55it/s, loss=0.0858, lr=1] Steps: 95%|█████████▍| 1420/1500 [15:19<00:51, 1.55it/s, loss=0.113, lr=1] Steps: 95%|█████████▍| 1421/1500 [15:20<00:50, 1.55it/s, loss=0.113, lr=1] Steps: 95%|█████████▍| 1421/1500 [15:20<00:50, 1.55it/s, loss=0.107, lr=1] Steps: 95%|█████████▍| 1422/1500 [15:21<00:50, 1.55it/s, loss=0.107, lr=1] Steps: 95%|█████████▍| 1422/1500 [15:21<00:50, 1.55it/s, loss=0.0445, lr=1] Steps: 95%|█████████▍| 1423/1500 [15:21<00:49, 1.55it/s, loss=0.0445, lr=1] Steps: 95%|█████████▍| 1423/1500 [15:21<00:49, 1.55it/s, loss=0.0858, lr=1] Steps: 95%|█████████▍| 1424/1500 [15:22<00:49, 1.55it/s, loss=0.0858, lr=1] Steps: 95%|█████████▍| 1424/1500 [15:22<00:49, 1.55it/s, loss=0.14, lr=1] Steps: 95%|█████████▌| 1425/1500 [15:23<00:48, 1.54it/s, loss=0.14, lr=1] Steps: 95%|█████████▌| 1425/1500 [15:23<00:48, 1.54it/s, loss=0.0467, lr=1] Steps: 95%|█████████▌| 1426/1500 [15:23<00:47, 1.54it/s, loss=0.0467, lr=1] Steps: 95%|█████████▌| 1426/1500 [15:23<00:47, 1.54it/s, loss=0.211, lr=1] Steps: 95%|█████████▌| 1427/1500 [15:24<00:47, 1.54it/s, loss=0.211, lr=1] Steps: 95%|█████████▌| 1427/1500 [15:24<00:47, 1.54it/s, loss=0.0604, lr=1] Steps: 95%|█████████▌| 1428/1500 [15:25<00:46, 1.54it/s, loss=0.0604, lr=1] Steps: 95%|█████████▌| 1428/1500 [15:25<00:46, 1.54it/s, loss=0.1, lr=1] Steps: 95%|█████████▌| 1429/1500 [15:25<00:45, 1.55it/s, loss=0.1, lr=1] Steps: 
95%|█████████▌| 1429/1500 [15:25<00:45, 1.55it/s, loss=0.127, lr=1] Steps: 95%|█████████▌| 1430/1500 [15:26<00:45, 1.55it/s, loss=0.127, lr=1] Steps: 95%|█████████▌| 1430/1500 [15:26<00:45, 1.55it/s, loss=0.132, lr=1] Steps: 95%|█████████▌| 1431/1500 [15:26<00:44, 1.55it/s, loss=0.132, lr=1] Steps: 95%|█████████▌| 1431/1500 [15:26<00:44, 1.55it/s, loss=0.19, lr=1] Steps: 95%|█████████▌| 1432/1500 [15:27<00:43, 1.55it/s, loss=0.19, lr=1] Steps: 95%|█████████▌| 1432/1500 [15:27<00:43, 1.55it/s, loss=0.0851, lr=1] Steps: 96%|█████████▌| 1433/1500 [15:28<00:43, 1.55it/s, loss=0.0851, lr=1] Steps: 96%|█████████▌| 1433/1500 [15:28<00:43, 1.55it/s, loss=0.145, lr=1] Steps: 96%|█████████▌| 1434/1500 [15:28<00:42, 1.55it/s, loss=0.145, lr=1] Steps: 96%|█████████▌| 1434/1500 [15:28<00:42, 1.55it/s, loss=0.296, lr=1] Steps: 96%|█████████▌| 1435/1500 [15:29<00:41, 1.55it/s, loss=0.296, lr=1] Steps: 96%|█████████▌| 1435/1500 [15:29<00:41, 1.55it/s, loss=0.0936, lr=1] Steps: 96%|█████████▌| 1436/1500 [15:30<00:41, 1.55it/s, loss=0.0936, lr=1] Steps: 96%|█████████▌| 1436/1500 [15:30<00:41, 1.55it/s, loss=0.0323, lr=1] Steps: 96%|█████████▌| 1437/1500 [15:30<00:40, 1.55it/s, loss=0.0323, lr=1] Steps: 96%|█████████▌| 1437/1500 [15:30<00:40, 1.55it/s, loss=0.114, lr=1] Steps: 96%|█████████▌| 1438/1500 [15:31<00:40, 1.55it/s, loss=0.114, lr=1] Steps: 96%|█████████▌| 1438/1500 [15:31<00:40, 1.55it/s, loss=0.0814, lr=1] Steps: 96%|█████████▌| 1439/1500 [15:32<00:39, 1.55it/s, loss=0.0814, lr=1] Steps: 96%|█████████▌| 1439/1500 [15:32<00:39, 1.55it/s, loss=0.103, lr=1] Steps: 96%|█████████▌| 1440/1500 [15:32<00:38, 1.55it/s, loss=0.103, lr=1] Steps: 96%|█████████▌| 1440/1500 [15:32<00:38, 1.55it/s, loss=0.123, lr=1] Steps: 96%|█████████▌| 1441/1500 [15:33<00:38, 1.54it/s, loss=0.123, lr=1] Steps: 96%|█████████▌| 1441/1500 [15:33<00:38, 1.54it/s, loss=0.064, lr=1] Steps: 96%|█████████▌| 1442/1500 [15:34<00:37, 1.54it/s, loss=0.064, lr=1] Steps: 96%|█████████▌| 1442/1500 [15:34<00:37, 
1.54it/s, loss=0.148, lr=1] Steps: 96%|█████████▌| 1443/1500 [15:34<00:36, 1.54it/s, loss=0.148, lr=1] Steps: 96%|█████████▌| 1443/1500 [15:34<00:36, 1.54it/s, loss=0.154, lr=1] Steps: 96%|█████████▋| 1444/1500 [15:35<00:36, 1.54it/s, loss=0.154, lr=1] Steps: 96%|█████████▋| 1444/1500 [15:35<00:36, 1.54it/s, loss=0.0929, lr=1] Steps: 96%|█████████▋| 1445/1500 [15:36<00:35, 1.55it/s, loss=0.0929, lr=1] Steps: 96%|█████████▋| 1445/1500 [15:36<00:35, 1.55it/s, loss=0.124, lr=1] Steps: 96%|█████████▋| 1446/1500 [15:36<00:34, 1.55it/s, loss=0.124, lr=1] Steps: 96%|█████████▋| 1446/1500 [15:36<00:34, 1.55it/s, loss=0.0775, lr=1] Steps: 96%|█████████▋| 1447/1500 [15:37<00:34, 1.55it/s, loss=0.0775, lr=1] Steps: 96%|█████████▋| 1447/1500 [15:37<00:34, 1.55it/s, loss=0.0934, lr=1] Steps: 97%|█████████▋| 1448/1500 [15:37<00:33, 1.55it/s, loss=0.0934, lr=1] Steps: 97%|█████████▋| 1448/1500 [15:37<00:33, 1.55it/s, loss=0.121, lr=1] Steps: 97%|█████████▋| 1449/1500 [15:38<00:32, 1.55it/s, loss=0.121, lr=1] Steps: 97%|█████████▋| 1449/1500 [15:38<00:32, 1.55it/s, loss=0.186, lr=1] Steps: 97%|█████████▋| 1450/1500 [15:39<00:32, 1.55it/s, loss=0.186, lr=1] Steps: 97%|█████████▋| 1450/1500 [15:39<00:32, 1.55it/s, loss=0.106, lr=1] Steps: 97%|█████████▋| 1451/1500 [15:39<00:31, 1.55it/s, loss=0.106, lr=1] Steps: 97%|█████████▋| 1451/1500 [15:39<00:31, 1.55it/s, loss=0.11, lr=1] Steps: 97%|█████████▋| 1452/1500 [15:40<00:30, 1.56it/s, loss=0.11, lr=1] Steps: 97%|█████████▋| 1452/1500 [15:40<00:30, 1.56it/s, loss=0.0888, lr=1] Steps: 97%|█████████▋| 1453/1500 [15:41<00:30, 1.55it/s, loss=0.0888, lr=1] Steps: 97%|█████████▋| 1453/1500 [15:41<00:30, 1.55it/s, loss=0.0609, lr=1] Steps: 97%|█████████▋| 1454/1500 [15:41<00:29, 1.55it/s, loss=0.0609, lr=1] Steps: 97%|█████████▋| 1454/1500 [15:41<00:29, 1.55it/s, loss=0.114, lr=1] Steps: 97%|█████████▋| 1455/1500 [15:42<00:28, 1.55it/s, loss=0.114, lr=1] Steps: 97%|█████████▋| 1455/1500 [15:42<00:28, 1.55it/s, loss=0.131, lr=1] Steps: 
97%|█████████▋| 1456/1500 [15:43<00:28, 1.55it/s, loss=0.131, lr=1] Steps: 97%|█████████▋| 1456/1500 [15:43<00:28, 1.55it/s, loss=0.123, lr=1] Steps: 97%|█████████▋| 1457/1500 [15:43<00:27, 1.54it/s, loss=0.123, lr=1] Steps: 97%|█████████▋| 1457/1500 [15:43<00:27, 1.54it/s, loss=0.147, lr=1] Steps: 97%|█████████▋| 1458/1500 [15:44<00:27, 1.54it/s, loss=0.147, lr=1] Steps: 97%|█████████▋| 1458/1500 [15:44<00:27, 1.54it/s, loss=0.0953, lr=1] Steps: 97%|█████████▋| 1459/1500 [15:45<00:26, 1.54it/s, loss=0.0953, lr=1] Steps: 97%|█████████▋| 1459/1500 [15:45<00:26, 1.54it/s, loss=0.03, lr=1] Steps: 97%|█████████▋| 1460/1500 [15:45<00:25, 1.55it/s, loss=0.03, lr=1] Steps: 97%|█████████▋| 1460/1500 [15:45<00:25, 1.55it/s, loss=0.136, lr=1] Steps: 97%|█████████▋| 1461/1500 [15:46<00:25, 1.55it/s, loss=0.136, lr=1] Steps: 97%|█████████▋| 1461/1500 [15:46<00:25, 1.55it/s, loss=0.206, lr=1] Steps: 97%|█████████▋| 1462/1500 [15:46<00:24, 1.55it/s, loss=0.206, lr=1] Steps: 97%|█████████▋| 1462/1500 [15:47<00:24, 1.55it/s, loss=0.0827, lr=1] Steps: 98%|█████████▊| 1463/1500 [15:47<00:23, 1.55it/s, loss=0.0827, lr=1] Steps: 98%|█████████▊| 1463/1500 [15:47<00:23, 1.55it/s, loss=0.124, lr=1] Steps: 98%|█████████▊| 1464/1500 [15:48<00:23, 1.55it/s, loss=0.124, lr=1] Steps: 98%|█████████▊| 1464/1500 [15:48<00:23, 1.55it/s, loss=0.083, lr=1] Steps: 98%|█████████▊| 1465/1500 [15:48<00:22, 1.55it/s, loss=0.083, lr=1] Steps: 98%|█████████▊| 1465/1500 [15:48<00:22, 1.55it/s, loss=0.141, lr=1] Steps: 98%|█████████▊| 1466/1500 [15:49<00:21, 1.55it/s, loss=0.141, lr=1] Steps: 98%|█████████▊| 1466/1500 [15:49<00:21, 1.55it/s, loss=0.118, lr=1] Steps: 98%|█████████▊| 1467/1500 [15:50<00:21, 1.55it/s, loss=0.118, lr=1] Steps: 98%|█████████▊| 1467/1500 [15:50<00:21, 1.55it/s, loss=0.105, lr=1] Steps: 98%|█████████▊| 1468/1500 [15:50<00:20, 1.55it/s, loss=0.105, lr=1] Steps: 98%|█████████▊| 1468/1500 [15:50<00:20, 1.55it/s, loss=0.129, lr=1] Steps: 98%|█████████▊| 1469/1500 [15:51<00:19, 
1.55it/s, loss=0.129, lr=1] Steps: 98%|█████████▊| 1469/1500 [15:51<00:19, 1.55it/s, loss=0.12, lr=1] Steps: 98%|█████████▊| 1470/1500 [15:52<00:19, 1.55it/s, loss=0.12, lr=1] Steps: 98%|█████████▊| 1470/1500 [15:52<00:19, 1.55it/s, loss=0.19, lr=1] Steps: 98%|█████████▊| 1471/1500 [15:52<00:18, 1.55it/s, loss=0.19, lr=1] Steps: 98%|█████████▊| 1471/1500 [15:52<00:18, 1.55it/s, loss=0.154, lr=1] Steps: 98%|█████████▊| 1472/1500 [15:53<00:18, 1.55it/s, loss=0.154, lr=1] Steps: 98%|█████████▊| 1472/1500 [15:53<00:18, 1.55it/s, loss=0.0808, lr=1] Steps: 98%|█████████▊| 1473/1500 [15:54<00:17, 1.54it/s, loss=0.0808, lr=1] Steps: 98%|█████████▊| 1473/1500 [15:54<00:17, 1.54it/s, loss=0.128, lr=1] Steps: 98%|█████████▊| 1474/1500 [15:54<00:16, 1.55it/s, loss=0.128, lr=1] Steps: 98%|█████████▊| 1474/1500 [15:54<00:16, 1.55it/s, loss=0.126, lr=1] Steps: 98%|█████████▊| 1475/1500 [15:55<00:16, 1.55it/s, loss=0.126, lr=1] Steps: 98%|█████████▊| 1475/1500 [15:55<00:16, 1.55it/s, loss=0.0942, lr=1] Steps: 98%|█████████▊| 1476/1500 [15:56<00:15, 1.55it/s, loss=0.0942, lr=1] Steps: 98%|█████████▊| 1476/1500 [15:56<00:15, 1.55it/s, loss=0.167, lr=1] Steps: 98%|█████████▊| 1477/1500 [15:56<00:14, 1.55it/s, loss=0.167, lr=1] Steps: 98%|█████████▊| 1477/1500 [15:56<00:14, 1.55it/s, loss=0.11, lr=1] Steps: 99%|█████████▊| 1478/1500 [15:57<00:14, 1.55it/s, loss=0.11, lr=1] Steps: 99%|█████████▊| 1478/1500 [15:57<00:14, 1.55it/s, loss=0.176, lr=1] Steps: 99%|█████████▊| 1479/1500 [15:57<00:13, 1.55it/s, loss=0.176, lr=1] Steps: 99%|█████████▊| 1479/1500 [15:57<00:13, 1.55it/s, loss=0.0344, lr=1] Steps: 99%|█████████▊| 1480/1500 [15:58<00:12, 1.55it/s, loss=0.0344, lr=1] Steps: 99%|█████████▊| 1480/1500 [15:58<00:12, 1.55it/s, loss=0.181, lr=1] Steps: 99%|█████████▊| 1481/1500 [15:59<00:12, 1.55it/s, loss=0.181, lr=1] Steps: 99%|█████████▊| 1481/1500 [15:59<00:12, 1.55it/s, loss=0.103, lr=1] Steps: 99%|█████████▉| 1482/1500 [15:59<00:11, 1.55it/s, loss=0.103, lr=1] Steps: 
99%|█████████▉| 1482/1500 [15:59<00:11, 1.55it/s, loss=0.27, lr=1] Steps: 99%|█████████▉| 1483/1500 [16:00<00:10, 1.55it/s, loss=0.27, lr=1] Steps: 99%|█████████▉| 1483/1500 [16:00<00:10, 1.55it/s, loss=0.154, lr=1] Steps: 99%|█████████▉| 1484/1500 [16:01<00:10, 1.55it/s, loss=0.154, lr=1] Steps: 99%|█████████▉| 1484/1500 [16:01<00:10, 1.55it/s, loss=0.154, lr=1] Steps: 99%|█████████▉| 1485/1500 [16:01<00:09, 1.55it/s, loss=0.154, lr=1] Steps: 99%|█████████▉| 1485/1500 [16:01<00:09, 1.55it/s, loss=0.107, lr=1] Steps: 99%|█████████▉| 1486/1500 [16:02<00:09, 1.55it/s, loss=0.107, lr=1] Steps: 99%|█████████▉| 1486/1500 [16:02<00:09, 1.55it/s, loss=0.304, lr=1] Steps: 99%|█████████▉| 1487/1500 [16:03<00:08, 1.55it/s, loss=0.304, lr=1] Steps: 99%|█████████▉| 1487/1500 [16:03<00:08, 1.55it/s, loss=0.218, lr=1] Steps: 99%|█████████▉| 1488/1500 [16:03<00:07, 1.55it/s, loss=0.218, lr=1] Steps: 99%|█████████▉| 1488/1500 [16:03<00:07, 1.55it/s, loss=0.137, lr=1] Steps: 99%|█████████▉| 1489/1500 [16:04<00:07, 1.53it/s, loss=0.137, lr=1] Steps: 99%|█████████▉| 1489/1500 [16:04<00:07, 1.53it/s, loss=0.0592, lr=1] Steps: 99%|█████████▉| 1490/1500 [16:05<00:06, 1.54it/s, loss=0.0592, lr=1] Steps: 99%|█████████▉| 1490/1500 [16:05<00:06, 1.54it/s, loss=0.0488, lr=1] Steps: 99%|█████████▉| 1491/1500 [16:05<00:05, 1.54it/s, loss=0.0488, lr=1] Steps: 99%|█████████▉| 1491/1500 [16:05<00:05, 1.54it/s, loss=0.0964, lr=1] Steps: 99%|█████████▉| 1492/1500 [16:06<00:05, 1.54it/s, loss=0.0964, lr=1] Steps: 99%|█████████▉| 1492/1500 [16:06<00:05, 1.54it/s, loss=0.178, lr=1] Steps: 100%|█████████▉| 1493/1500 [16:07<00:04, 1.54it/s, loss=0.178, lr=1] Steps: 100%|█████████▉| 1493/1500 [16:07<00:04, 1.54it/s, loss=0.259, lr=1] Steps: 100%|█████████▉| 1494/1500 [16:07<00:03, 1.55it/s, loss=0.259, lr=1] Steps: 100%|█████████▉| 1494/1500 [16:07<00:03, 1.55it/s, loss=0.0837, lr=1] Steps: 100%|█████████▉| 1495/1500 [16:08<00:03, 1.55it/s, loss=0.0837, lr=1] Steps: 100%|█████████▉| 1495/1500 
[16:08<00:03, 1.55it/s, loss=0.198, lr=1]
Steps: 100%|█████████▉| 1496/1500 [16:08<00:02, 1.55it/s, loss=0.198, lr=1]
Steps: 100%|█████████▉| 1496/1500 [16:08<00:02, 1.55it/s, loss=0.146, lr=1]
Steps: 100%|█████████▉| 1497/1500 [16:09<00:01, 1.55it/s, loss=0.146, lr=1]
Steps: 100%|█████████▉| 1497/1500 [16:09<00:01, 1.55it/s, loss=0.186, lr=1]
Steps: 100%|█████████▉| 1498/1500 [16:10<00:01, 1.55it/s, loss=0.186, lr=1]
Steps: 100%|█████████▉| 1498/1500 [16:10<00:01, 1.55it/s, loss=0.15, lr=1]
Steps: 100%|█████████▉| 1499/1500 [16:10<00:00, 1.55it/s, loss=0.15, lr=1]
Steps: 100%|█████████▉| 1499/1500 [16:10<00:00, 1.55it/s, loss=0.105, lr=1]
Steps: 100%|██████████| 1500/1500 [16:11<00:00, 1.55it/s, loss=0.105, lr=1]
Steps: 100%|██████████| 1500/1500 [16:11<00:00, 1.55it/s, loss=0.0747, lr=1]
Model weights saved in /tmp/train/output/sd35_large_train_replicate/pytorch_lora_weights.safetensors
Loading pipeline components...: 0%| | 0/9 [00:00<?, ?it/s]
Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stable-diffusion-3.5-large.
Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stable-diffusion-3.5-large.
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 50%|█████ | 1/2 [00:05<00:05, 5.20s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.93s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00, 4.97s/it]
Loaded text_encoder_3 as T5EncoderModel from `text_encoder_3` subfolder of stable-diffusion-3.5-large.
Loading pipeline components...: 33%|███▎ | 3/9 [00:10<00:20, 3.36s/it]
{'dual_attention_layers'} was not found in config. Values will be initialized to default values.
Loaded transformer as SD3Transformer2DModel from `transformer` subfolder of stable-diffusion-3.5-large.
Loading pipeline components...: 44%|████▍ | 4/9 [00:12<00:14, 2.90s/it]
{'max_image_seq_len', 'base_image_seq_len', 'max_shift', 'base_shift', 'use_dynamic_shifting'} was not found in config. Values will be initialized to default values.
Loaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of stable-diffusion-3.5-large.
Loaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stable-diffusion-3.5-large.
Loading pipeline components...: 67%|██████▋ | 6/9 [00:13<00:05, 1.91s/it]
Loaded text_encoder as CLIPTextModelWithProjection from `text_encoder` subfolder of stable-diffusion-3.5-large.
Loading pipeline components...: 78%|███████▊ | 7/9 [00:14<00:03, 1.53s/it]
Loaded vae as AutoencoderKL from `vae` subfolder of stable-diffusion-3.5-large.
Loaded tokenizer_3 as T5TokenizerFast from `tokenizer_3` subfolder of stable-diffusion-3.5-large.
Loading pipeline components...: 100%|██████████| 9/9 [00:14<00:00, 1.07it/s]
Loading pipeline components...: 100%|██████████| 9/9 [00:14<00:00, 1.59s/it]
Loading text_encoder.
Loading text_encoder_2.
0%| | 0/1 [00:00<?, ?it/s]
100%|██████████| 1/1 [00:01<00:00, 1.11s/it]
100%|██████████| 1/1 [00:01<00:00, 1.11s/it]
Steps: 100%|██████████| 1500/1500 [16:29<00:00, 1.52it/s, loss=0.0747, lr=1]
./
./output/
./output/sd35_large_train_replicate/
./output/sd35_large_train_replicate/README.md
./output/sd35_large_train_replicate/lora.safetensors