Prediction
genmoai/mochi-1-lora-trainer:170ea99fb48a30fef98cb1c9fb403a2882ab9d60c2ba15ad9383ace33c3fa385
Input
- seed: 42
- steps: 500
- hf_token: ████████████████████ (This value was redacted after being sent to the model.)
- optimizer: adamw
- batch_size: 1
- hf_repo_id: lucataco/mochi-lora-vhs
- compile_dit: true
- input_videos: vhs-4.zip
- learning_rate: 0.0004
- trim_and_crop: true
- caption_dropout: 0.1
{
  "seed": 42,
  "steps": 500,
  "hf_token": "[REDACTED]",
  "optimizer": "adamw",
  "batch_size": 1,
  "hf_repo_id": "lucataco/mochi-lora-vhs",
  "compile_dit": true,
  "input_videos": "https://replicate.delivery/pbxt/M7v7tzysZ9DiC0Pdla5rGUDW9tafeTOKzx9iS5fiqhehavkX/vhs-4.zip",
  "learning_rate": 0.0004,
  "trim_and_crop": true,
  "caption_dropout": 0.1
}
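The input above contains a Hugging Face token, so it can be convenient to assemble the payload in code and supply the token at run time instead of pasting it into a script. The following is a minimal sketch, not part of the original page: the function name `build_training_input` and the `HF_TOKEN` environment variable are hypothetical choices made for illustration, while the field names and values are copied from the input shown above.

```python
import json
import os


def build_training_input(hf_token: str) -> dict:
    """Return the trainer input shown above, with the token supplied by the caller.

    Hypothetical helper for illustration; only the keys and values come from the page.
    """
    return {
        "seed": 42,
        "steps": 500,
        "hf_token": hf_token,
        "optimizer": "adamw",
        "batch_size": 1,
        "hf_repo_id": "lucataco/mochi-lora-vhs",
        "compile_dit": True,
        "input_videos": "https://replicate.delivery/pbxt/M7v7tzysZ9DiC0Pdla5rGUDW9tafeTOKzx9iS5fiqhehavkX/vhs-4.zip",
        "learning_rate": 0.0004,
        "trim_and_crop": True,
        "caption_dropout": 0.1,
    }


if __name__ == "__main__":
    # HF_TOKEN is an assumed variable name; use whatever your environment provides.
    payload = build_training_input(os.environ.get("HF_TOKEN", ""))
    # Mask the token before printing so it never ends up in terminal history or logs.
    print(json.dumps({**payload, "hf_token": "[REDACTED]"}, indent=2))
```

The returned dictionary can be passed as the `input` argument in the client examples that follow.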
Install Replicate's Node.js client library:
npm install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
import Replicate from "replicate";
const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run genmoai/mochi-1-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "genmoai/mochi-1-lora-trainer:170ea99fb48a30fef98cb1c9fb403a2882ab9d60c2ba15ad9383ace33c3fa385",
  {
    input: {
      seed: 42,
      steps: 500,
      hf_token: "[REDACTED]",
      optimizer: "adamw",
      batch_size: 1,
      hf_repo_id: "lucataco/mochi-lora-vhs",
      compile_dit: true,
      input_videos: "https://replicate.delivery/pbxt/M7v7tzysZ9DiC0Pdla5rGUDW9tafeTOKzx9iS5fiqhehavkX/vhs-4.zip",
      learning_rate: 0.0004,
      trim_and_crop: true,
      caption_dropout: 0.1
    }
  }
);
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
import replicate
Run genmoai/mochi-1-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "genmoai/mochi-1-lora-trainer:170ea99fb48a30fef98cb1c9fb403a2882ab9d60c2ba15ad9383ace33c3fa385",
    input={
        "seed": 42,
        "steps": 500,
        "hf_token": "[REDACTED]",
        "optimizer": "adamw",
        "batch_size": 1,
        "hf_repo_id": "lucataco/mochi-lora-vhs",
        "compile_dit": True,
        "input_videos": "https://replicate.delivery/pbxt/M7v7tzysZ9DiC0Pdla5rGUDW9tafeTOKzx9iS5fiqhehavkX/vhs-4.zip",
        "learning_rate": 0.0004,
        "trim_and_crop": True,
        "caption_dropout": 0.1
    }
)
print(output)
To learn more, take a look at the guide on getting started with Python.
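Before submitting a run like this one, it can help to estimate how the `steps`, `batch_size`, and `learning_rate` inputs translate into epochs and per-step learning rates. The sketch below is an illustration only and is not taken from the trainer's source. The epoch arithmetic reproduces the figures in this run's logs (4 examples, 4 batches per epoch, 125 epochs for 500 steps). The linear warmup is inferred from the logged learning rate rising by 2e-6 per step; the warmup length of 200 steps and the behaviour after warmup are assumptions, since the logs shown here stop before that point.

```python
import math


def num_epochs(total_steps: int, num_examples: int, batch_size: int = 1) -> int:
    """Epochs needed to reach total_steps, given the dataset and batch size."""
    batches_per_epoch = math.ceil(num_examples / batch_size)
    return math.ceil(total_steps / batches_per_epoch)


def warmup_lr(step: int, peak_lr: float = 0.0004, warmup_steps: int = 200) -> float:
    """Linear warmup from 0 to peak_lr.

    warmup_steps=200 is an assumption (0.0004 / 2e-6 per step). Holding the
    rate at peak_lr afterwards is a placeholder, not confirmed by the logs.
    """
    return peak_lr * min(step, warmup_steps) / warmup_steps


if __name__ == "__main__":
    print(num_epochs(total_steps=500, num_examples=4, batch_size=1))
    for step in (1, 50, 100, 150):
        print(step, warmup_lr(step))
```

With the values from this run, `num_epochs(500, 4, 1)` agrees with the `Num epochs = 125` line in the logs, and `warmup_lr(100)` agrees with the logged `lr=0.0002` at step 100.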
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
Run genmoai/mochi-1-lora-trainer using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "genmoai/mochi-1-lora-trainer:170ea99fb48a30fef98cb1c9fb403a2882ab9d60c2ba15ad9383ace33c3fa385",
    "input": {
      "seed": 42,
      "steps": 500,
      "hf_token": "[REDACTED]",
      "optimizer": "adamw",
      "batch_size": 1,
      "hf_repo_id": "lucataco/mochi-lora-vhs",
      "compile_dit": true,
      "input_videos": "https://replicate.delivery/pbxt/M7v7tzysZ9DiC0Pdla5rGUDW9tafeTOKzx9iS5fiqhehavkX/vhs-4.zip",
      "learning_rate": 0.0004,
      "trim_and_crop": true,
      "caption_dropout": 0.1
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
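The prediction object returned by the API (shown under Output) includes `created_at` and `completed_at` timestamps and a `logs` string containing the trainer's progress-bar lines. The following sketch shows one way to read both from an already-decoded JSON object. It is an illustration under assumptions: the helper names are hypothetical, and the regular expression is fitted to the `loss=..., lr=...` progress lines visible in this particular run, so it may need adjusting for other log formats.

```python
import re
from datetime import datetime

# Matches progress lines such as "2/500 [04:18<14:45:44, 106.72s/it, loss=1.05, lr=4e-6]".
STEP_RE = re.compile(r"(\d+)/(\d+) \[[^\]]*?loss=([0-9.]+), lr=([0-9.e-]+)\]")


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp ending in 'Z' (UTC)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def run_seconds(prediction: dict) -> float:
    """Seconds between created_at and completed_at (includes any queue and setup time)."""
    start = parse_timestamp(prediction["created_at"])
    end = parse_timestamp(prediction["completed_at"])
    return (end - start).total_seconds()


def parse_training_steps(logs: str) -> dict:
    """Map step number to (loss, lr).

    Each step is printed more than once in the logs; later lines overwrite
    earlier ones, so the last reported values for a step are kept.
    """
    steps = {}
    for match in STEP_RE.finditer(logs):
        steps[int(match.group(1))] = (float(match.group(3)), float(match.group(4)))
    return steps


if __name__ == "__main__":
    prediction = {
        "created_at": "2024-12-11T17:55:06.813000Z",
        "completed_at": "2024-12-11T18:27:26.196131Z",
    }
    print(f"{run_seconds(prediction) / 60:.1f} minutes")
```

For the timestamps in this prediction, the elapsed time comes to roughly 32 minutes.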
Output
{
"completed_at": "2024-12-11T18:27:26.196131Z",
"created_at": "2024-12-11T17:55:06.813000Z",
"data_removed": false,
"error": null,
"id": "mxey6b9vqnrm80ckpveaqjqb9w",
"input": {
"seed": 42,
"steps": 500,
"hf_token": "[REDACTED]",
"optimizer": "adamw",
"batch_size": 1,
"hf_repo_id": "lucataco/mochi-lora-vhs",
"compile_dit": true,
"input_videos": "https://replicate.delivery/pbxt/M7v7tzysZ9DiC0Pdla5rGUDW9tafeTOKzx9iS5fiqhehavkX/vhs-4.zip",
"learning_rate": 0.0004,
"trim_and_crop": true,
"caption_dropout": 0.1
},
"logs": "Cleaning up previous runs\nExtracted 8 files from zip to videos_input\n---Starting to Trim input videos---\nProcessing: videos_input/vhs1.mp4\nCopied videos_input/vhs1.txt to videos_prepared/vhs1.txt\nMoviepy - Building video videos_prepared/vhs1.mp4.\nMoviepy - Writing video videos_prepared/vhs1.mp4\n 0%| | 0/4 [00:00<?, ?it/s]\n0%| | 0/4 [00:00<?, ?it/s]\n 0%| | 0/4 [00:00<?, ?it/s]\nt: 0%| | 0/40 [00:00<?, ?it/s, now=None]\u001b[A\n \u001b[A\n0%| | 0/4 [00:00<?, ?it/s]\nMoviepy - Done !\nMoviepy - video ready videos_prepared/vhs1.mp4\n 0%| | 0/4 [00:00<?, ?it/s]\nProcessing: videos_input/vhs2.mp4\nCopied videos_input/vhs2.txt to videos_prepared/vhs2.txt\nMoviepy - Building video videos_prepared/vhs2.mp4.\nMoviepy - Writing video videos_prepared/vhs2.mp4\n 25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]\n25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]\n 25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]\nt: 0%| | 0/40 [00:00<?, ?it/s, now=None]\u001b[A\n \u001b[A\n25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]\nMoviepy - Done !\nMoviepy - video ready videos_prepared/vhs2.mp4\n 25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]\nProcessing: videos_input/vhs3.mp4\nCopied videos_input/vhs3.txt to videos_prepared/vhs3.txt\nMoviepy - Building video videos_prepared/vhs3.mp4.\n 50%|█████ | 2/4 [00:00<00:00, 3.05it/s]\n50%|█████ | 2/4 [00:00<00:00, 3.05it/s]\nMoviepy - Writing video videos_prepared/vhs3.mp4\n 50%|█████ | 2/4 [00:00<00:00, 3.05it/s]\nt: 0%| | 0/40 [00:00<?, ?it/s, now=None]\u001b[A\n \u001b[A\n50%|█████ | 2/4 [00:00<00:00, 3.05it/s]\nMoviepy - Done !\nMoviepy - video ready videos_prepared/vhs3.mp4\n 50%|█████ | 2/4 [00:00<00:00, 3.05it/s]\nProcessing: videos_input/vhs4.mp4\nCopied videos_input/vhs4.txt to videos_prepared/vhs4.txt\nMoviepy - Building video videos_prepared/vhs4.mp4.\nMoviepy - Writing video videos_prepared/vhs4.mp4\n 75%|███████▌ | 3/4 [00:00<00:00, 3.05it/s]\n75%|███████▌ | 3/4 [00:01<00:00, 3.05it/s]\n 75%|███████▌ | 3/4 [00:01<00:00, 3.05it/s]\nt: 0%| | 0/40 [00:00<?, ?it/s, 
now=None]\u001b[A\n \u001b[A\n75%|███████▌ | 3/4 [00:01<00:00, 3.05it/s]\nMoviepy - Done !\nMoviepy - video ready videos_prepared/vhs4.mp4\n 75%|███████▌ | 3/4 [00:01<00:00, 3.05it/s]\n100%|██████████| 4/4 [00:01<00:00, 3.07it/s]\n100%|██████████| 4/4 [00:01<00:00, 3.07it/s]\n---Starting to Embed videos---\nLoading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]\nLoading checkpoint shards: 50%|█████ | 1/2 [00:00<00:00, 1.67it/s]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00, 1.78it/s]\nLoading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00, 1.76it/s]\nLoading pipeline components...: 0%| | 0/3 [00:00<?, ?it/s]\nLoading pipeline components...: 100%|██████████| 3/3 [00:00<00:00, 681.59it/s]\nProcessing videos_prepared/vhs1.mp4\nTrimmed video from 40 to first 37 frames\n0it [00:00, ?it/s]\nProcessing videos_prepared/vhs2.mp4\nTrimmed video from 40 to first 37 frames\n1it [00:01, 1.38s/it]\nProcessing videos_prepared/vhs3.mp4\nTrimmed video from 40 to first 37 frames\n2it [00:02, 1.15s/it]\nProcessing videos_prepared/vhs4.mp4\nTrimmed video from 40 to first 37 frames\n3it [00:03, 1.07s/it]\n4it [00:04, 1.03s/it]\n4it [00:04, 1.08s/it]\n---Starting training---\nFound 4 training videos in videos_prepared\nLoaded 4/4 valid file pairs.\n===== Memory before training =====\nmemory_allocated=18.780 GB\nmax_memory_allocated=18.780 GB\nmax_memory_reserved=19.250 GB\n***** Running training *****\nNum trainable parameters = 19005440\nNum examples = 4\nNum batches each epoch = 4\nNum epochs = 125\nInstantaneous batch size per device = 1\nTotal train batch size (w. 
parallel, distributed & accumulation) = 1\nTotal optimization steps = 500\nSteps: 0%| | 0/500 [00:00<?, ?it/s]W1211 17:57:31.075000 135012609543680 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.\nW1211 17:57:31.089000 135012609543680 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.\nW1211 17:57:31.224000 135012609543680 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.\nSteps: 0%| | 1/500 [04:16<35:33:30, 256.53s/it]\nSteps: 0%| | 1/500 [04:16<35:33:30, 256.53s/it, loss=0.933, lr=2e-6]\nSteps: 0%| | 2/500 [04:18<14:45:44, 106.72s/it, loss=0.933, lr=2e-6]\nSteps: 0%| | 2/500 [04:18<14:45:44, 106.72s/it, loss=1.05, lr=4e-6] \nSteps: 1%| | 3/500 [04:20<8:07:22, 58.84s/it, loss=1.05, lr=4e-6] \nSteps: 1%| | 3/500 [04:20<8:07:22, 58.84s/it, loss=0.864, lr=6e-6]\nSteps: 1%| | 4/500 [04:22<5:00:27, 36.34s/it, loss=0.864, lr=6e-6]\nSteps: 1%| | 4/500 [04:22<5:00:27, 36.34s/it, loss=1.06, lr=8e-6] \nSteps: 1%| | 5/500 [04:29<3:34:15, 25.97s/it, loss=1.06, lr=8e-6]\nSteps: 1%| | 5/500 [04:29<3:34:15, 25.97s/it, loss=0.874, lr=1e-5]\nSteps: 1%| | 6/500 [04:31<2:26:21, 17.78s/it, loss=0.874, lr=1e-5]\nSteps: 1%| | 6/500 [04:31<2:26:21, 17.78s/it, loss=1.02, lr=1.2e-5]\nSteps: 1%|▏ | 7/500 [04:33<1:43:19, 12.58s/it, loss=1.02, lr=1.2e-5]\nSteps: 1%|▏ | 7/500 [04:33<1:43:19, 12.58s/it, loss=0.902, lr=1.4e-5]\nSteps: 2%|▏ | 8/500 [04:35<1:15:09, 9.17s/it, loss=0.902, lr=1.4e-5]\nSteps: 2%|▏ | 8/500 [04:35<1:15:09, 9.17s/it, loss=1.08, lr=1.6e-5] \nSteps: 2%|▏ | 9/500 [04:42<1:11:01, 8.68s/it, loss=1.08, lr=1.6e-5]\nSteps: 2%|▏ | 9/500 [04:42<1:11:01, 8.68s/it, loss=1.04, lr=1.8e-5]\nSteps: 2%|▏ | 10/500 [04:44<53:42, 6.58s/it, loss=1.04, lr=1.8e-5] \nSteps: 2%|▏ | 10/500 [04:44<53:42, 6.58s/it, loss=1.09, lr=2e-5] \nSteps: 2%|▏ | 11/500 [04:46<41:50, 5.13s/it, loss=1.09, lr=2e-5]\nSteps: 2%|▏ | 
11/500 [04:46<41:50, 5.13s/it, loss=0.886, lr=2.2e-5]\nSteps: 2%|▏ | 12/500 [04:48<33:40, 4.14s/it, loss=0.886, lr=2.2e-5]\nSteps: 2%|▏ | 12/500 [04:48<33:40, 4.14s/it, loss=1.1, lr=2.4e-5] \nSteps: 3%|▎ | 13/500 [04:56<42:08, 5.19s/it, loss=1.1, lr=2.4e-5]\nSteps: 3%|▎ | 13/500 [04:56<42:08, 5.19s/it, loss=0.881, lr=2.6e-5]\nSteps: 3%|▎ | 14/500 [04:57<33:55, 4.19s/it, loss=0.881, lr=2.6e-5]\nSteps: 3%|▎ | 14/500 [04:57<33:55, 4.19s/it, loss=1.07, lr=2.8e-5] \nSteps: 3%|▎ | 15/500 [04:59<28:12, 3.49s/it, loss=1.07, lr=2.8e-5]\nSteps: 3%|▎ | 15/500 [04:59<28:12, 3.49s/it, loss=0.79, lr=3e-5] \nSteps: 3%|▎ | 16/500 [05:01<24:13, 3.00s/it, loss=0.79, lr=3e-5]\nSteps: 3%|▎ | 16/500 [05:01<24:13, 3.00s/it, loss=1.07, lr=3.2e-5]\nSteps: 3%|▎ | 17/500 [05:09<35:16, 4.38s/it, loss=1.07, lr=3.2e-5]\nSteps: 3%|▎ | 17/500 [05:09<35:16, 4.38s/it, loss=0.873, lr=3.4e-5]\nSteps: 4%|▎ | 18/500 [05:11<29:08, 3.63s/it, loss=0.873, lr=3.4e-5]\nSteps: 4%|▎ | 18/500 [05:11<29:08, 3.63s/it, loss=0.968, lr=3.6e-5]\nSteps: 4%|▍ | 19/500 [05:13<24:50, 3.10s/it, loss=0.968, lr=3.6e-5]\nSteps: 4%|▍ | 19/500 [05:13<24:50, 3.10s/it, loss=0.979, lr=3.8e-5]\nSteps: 4%|▍ | 20/500 [05:14<21:50, 2.73s/it, loss=0.979, lr=3.8e-5]\nSteps: 4%|▍ | 20/500 [05:14<21:50, 2.73s/it, loss=1.08, lr=4e-5] \nSteps: 4%|▍ | 21/500 [05:22<33:40, 4.22s/it, loss=1.08, lr=4e-5]\nSteps: 4%|▍ | 21/500 [05:22<33:40, 4.22s/it, loss=0.866, lr=4.2e-5]\nSteps: 4%|▍ | 22/500 [05:24<27:59, 3.51s/it, loss=0.866, lr=4.2e-5]\nSteps: 4%|▍ | 22/500 [05:24<27:59, 3.51s/it, loss=0.966, lr=4.4e-5]\nSteps: 5%|▍ | 23/500 [05:26<24:01, 3.02s/it, loss=0.966, lr=4.4e-5]\nSteps: 5%|▍ | 23/500 [05:26<24:01, 3.02s/it, loss=0.849, lr=4.6e-5]\nSteps: 5%|▍ | 24/500 [05:28<21:13, 2.68s/it, loss=0.849, lr=4.6e-5]\nSteps: 5%|▍ | 24/500 [05:28<21:13, 2.68s/it, loss=1.07, lr=4.8e-5] \nSteps: 5%|▌ | 25/500 [05:35<33:08, 4.19s/it, loss=1.07, lr=4.8e-5]\nSteps: 5%|▌ | 25/500 [05:35<33:08, 4.19s/it, loss=0.853, lr=5e-5] \nSteps: 5%|▌ | 26/500 
[05:37<27:34, 3.49s/it, loss=0.853, lr=5e-5]\nSteps: 5%|▌ | 26/500 [05:37<27:34, 3.49s/it, loss=0.996, lr=5.2e-5]\nSteps: 5%|▌ | 27/500 [05:39<23:40, 3.00s/it, loss=0.996, lr=5.2e-5]\nSteps: 5%|▌ | 27/500 [05:39<23:40, 3.00s/it, loss=0.879, lr=5.4e-5]\nSteps: 6%|▌ | 28/500 [05:41<20:56, 2.66s/it, loss=0.879, lr=5.4e-5]\nSteps: 6%|▌ | 28/500 [05:41<20:56, 2.66s/it, loss=0.977, lr=5.6e-5]\nSteps: 6%|▌ | 29/500 [05:49<32:28, 4.14s/it, loss=0.977, lr=5.6e-5]\nSteps: 6%|▌ | 29/500 [05:49<32:28, 4.14s/it, loss=0.881, lr=5.8e-5]\nSteps: 6%|▌ | 30/500 [05:50<27:04, 3.46s/it, loss=0.881, lr=5.8e-5]\nSteps: 6%|▌ | 30/500 [05:50<27:04, 3.46s/it, loss=1.06, lr=6e-5] \nSteps: 6%|▌ | 31/500 [05:52<23:18, 2.98s/it, loss=1.06, lr=6e-5]\nSteps: 6%|▌ | 31/500 [05:52<23:18, 2.98s/it, loss=1.05, lr=6.2e-5]\nSteps: 6%|▋ | 32/500 [05:54<20:39, 2.65s/it, loss=1.05, lr=6.2e-5]\nSteps: 6%|▋ | 32/500 [05:54<20:39, 2.65s/it, loss=0.985, lr=6.4e-5]\nSteps: 7%|▋ | 33/500 [06:02<32:21, 4.16s/it, loss=0.985, lr=6.4e-5]\nSteps: 7%|▋ | 33/500 [06:02<32:21, 4.16s/it, loss=0.871, lr=6.6e-5]\nSteps: 7%|▋ | 34/500 [06:04<26:57, 3.47s/it, loss=0.871, lr=6.6e-5]\nSteps: 7%|▋ | 34/500 [06:04<26:57, 3.47s/it, loss=1.04, lr=6.8e-5] \nSteps: 7%|▋ | 35/500 [06:06<23:10, 2.99s/it, loss=1.04, lr=6.8e-5]\nSteps: 7%|▋ | 35/500 [06:06<23:10, 2.99s/it, loss=0.829, lr=7e-5] \nSteps: 7%|▋ | 36/500 [06:08<20:32, 2.66s/it, loss=0.829, lr=7e-5]\nSteps: 7%|▋ | 36/500 [06:08<20:32, 2.66s/it, loss=0.963, lr=7.2e-5]\nSteps: 7%|▋ | 37/500 [06:15<32:12, 4.17s/it, loss=0.963, lr=7.2e-5]\nSteps: 7%|▋ | 37/500 [06:15<32:12, 4.17s/it, loss=0.878, lr=7.4e-5]\nSteps: 8%|▊ | 38/500 [06:17<26:49, 3.48s/it, loss=0.878, lr=7.4e-5]\nSteps: 8%|▊ | 38/500 [06:17<26:49, 3.48s/it, loss=1.03, lr=7.6e-5] \nSteps: 8%|▊ | 39/500 [06:19<23:02, 3.00s/it, loss=1.03, lr=7.6e-5]\nSteps: 8%|▊ | 39/500 [06:19<23:02, 3.00s/it, loss=0.886, lr=7.8e-5]\nSteps: 8%|▊ | 40/500 [06:21<20:24, 2.66s/it, loss=0.886, lr=7.8e-5]\nSteps: 8%|▊ | 40/500 
[06:21<20:24, 2.66s/it, loss=1.06, lr=8e-5] \nSteps: 8%|▊ | 41/500 [06:28<31:38, 4.14s/it, loss=1.06, lr=8e-5]\nSteps: 8%|▊ | 41/500 [06:28<31:38, 4.14s/it, loss=0.874, lr=8.2e-5]\nSteps: 8%|▊ | 42/500 [06:30<26:23, 3.46s/it, loss=0.874, lr=8.2e-5]\nSteps: 8%|▊ | 42/500 [06:30<26:23, 3.46s/it, loss=1.07, lr=8.4e-5] \nSteps: 9%|▊ | 43/500 [06:32<22:43, 2.98s/it, loss=1.07, lr=8.4e-5]\nSteps: 9%|▊ | 43/500 [06:32<22:43, 2.98s/it, loss=0.911, lr=8.6e-5]\nSteps: 9%|▉ | 44/500 [06:34<20:07, 2.65s/it, loss=0.911, lr=8.6e-5]\nSteps: 9%|▉ | 44/500 [06:34<20:07, 2.65s/it, loss=1.05, lr=8.8e-5] \nSteps: 9%|▉ | 45/500 [06:42<31:30, 4.15s/it, loss=1.05, lr=8.8e-5]\nSteps: 9%|▉ | 45/500 [06:42<31:30, 4.15s/it, loss=0.874, lr=9e-5] \nSteps: 9%|▉ | 46/500 [06:44<26:15, 3.47s/it, loss=0.874, lr=9e-5]\nSteps: 9%|▉ | 46/500 [06:44<26:15, 3.47s/it, loss=1.06, lr=9.2e-5]\nSteps: 9%|▉ | 47/500 [06:45<22:34, 2.99s/it, loss=1.06, lr=9.2e-5]\nSteps: 9%|▉ | 47/500 [06:45<22:34, 2.99s/it, loss=0.833, lr=9.4e-5]\nSteps: 10%|▉ | 48/500 [06:47<20:00, 2.66s/it, loss=0.833, lr=9.4e-5]\nSteps: 10%|▉ | 48/500 [06:47<20:00, 2.66s/it, loss=0.973, lr=9.6e-5]\nSteps: 10%|▉ | 49/500 [06:55<31:31, 4.19s/it, loss=0.973, lr=9.6e-5]\nSteps: 10%|▉ | 49/500 [06:55<31:31, 4.19s/it, loss=0.883, lr=9.8e-5]\nSteps: 10%|█ | 50/500 [06:57<26:13, 3.50s/it, loss=0.883, lr=9.8e-5]\nSteps: 10%|█ | 50/500 [06:57<26:13, 3.50s/it, loss=1.08, lr=0.0001] \nSteps: 10%|█ | 51/500 [06:59<22:31, 3.01s/it, loss=1.08, lr=0.0001]\nSteps: 10%|█ | 51/500 [06:59<22:31, 3.01s/it, loss=0.826, lr=0.000102]\nSteps: 10%|█ | 52/500 [07:01<19:56, 2.67s/it, loss=0.826, lr=0.000102]\nSteps: 10%|█ | 52/500 [07:01<19:56, 2.67s/it, loss=0.939, lr=0.000104]\nSteps: 11%|█ | 53/500 [07:11<37:37, 5.05s/it, loss=0.939, lr=0.000104]\nSteps: 11%|█ | 53/500 [07:11<37:37, 5.05s/it, loss=0.789, lr=0.000106]\nSteps: 11%|█ | 54/500 [07:13<30:27, 4.10s/it, loss=0.789, lr=0.000106]\nSteps: 11%|█ | 54/500 [07:13<30:27, 4.10s/it, loss=1.05, lr=0.000108] 
\nSteps: 11%|█ | 55/500 [07:15<25:25, 3.43s/it, loss=1.05, lr=0.000108]\nSteps: 11%|█ | 55/500 [07:15<25:25, 3.43s/it, loss=1.05, lr=0.00011] \nSteps: 11%|█ | 56/500 [07:17<21:55, 2.96s/it, loss=1.05, lr=0.00011]\nSteps: 11%|█ | 56/500 [07:17<21:55, 2.96s/it, loss=0.958, lr=0.000112]\nSteps: 11%|█▏ | 57/500 [07:24<31:59, 4.33s/it, loss=0.958, lr=0.000112]\nSteps: 11%|█▏ | 57/500 [07:24<31:59, 4.33s/it, loss=0.842, lr=0.000114]\nSteps: 12%|█▏ | 58/500 [07:26<26:28, 3.59s/it, loss=0.842, lr=0.000114]\nSteps: 12%|█▏ | 58/500 [07:26<26:28, 3.59s/it, loss=0.939, lr=0.000116]\nSteps: 12%|█▏ | 59/500 [07:28<22:36, 3.08s/it, loss=0.939, lr=0.000116]\nSteps: 12%|█▏ | 59/500 [07:28<22:36, 3.08s/it, loss=0.882, lr=0.000118]\nSteps: 12%|█▏ | 60/500 [07:30<19:54, 2.71s/it, loss=0.882, lr=0.000118]\nSteps: 12%|█▏ | 60/500 [07:30<19:54, 2.71s/it, loss=0.952, lr=0.00012] \nSteps: 12%|█▏ | 61/500 [07:38<30:27, 4.16s/it, loss=0.952, lr=0.00012]\nSteps: 12%|█▏ | 61/500 [07:38<30:27, 4.16s/it, loss=1.05, lr=0.000122]\nSteps: 12%|█▏ | 62/500 [07:40<25:22, 3.48s/it, loss=1.05, lr=0.000122]\nSteps: 12%|█▏ | 62/500 [07:40<25:22, 3.48s/it, loss=0.985, lr=0.000124]\nSteps: 13%|█▎ | 63/500 [07:41<21:48, 3.00s/it, loss=0.985, lr=0.000124]\nSteps: 13%|█▎ | 63/500 [07:41<21:48, 3.00s/it, loss=0.816, lr=0.000126]\nSteps: 13%|█▎ | 64/500 [07:43<19:18, 2.66s/it, loss=0.816, lr=0.000126]\nSteps: 13%|█▎ | 64/500 [07:43<19:18, 2.66s/it, loss=1.02, lr=0.000128] \nSteps: 13%|█▎ | 65/500 [07:51<30:04, 4.15s/it, loss=1.02, lr=0.000128]\nSteps: 13%|█▎ | 65/500 [07:51<30:04, 4.15s/it, loss=0.855, lr=0.00013]\nSteps: 13%|█▎ | 66/500 [07:53<25:03, 3.47s/it, loss=0.855, lr=0.00013]\nSteps: 13%|█▎ | 66/500 [07:53<25:03, 3.47s/it, loss=0.947, lr=0.000132]\nSteps: 13%|█▎ | 67/500 [07:55<21:33, 2.99s/it, loss=0.947, lr=0.000132]\nSteps: 13%|█▎ | 67/500 [07:55<21:33, 2.99s/it, loss=0.879, lr=0.000134]\nSteps: 14%|█▎ | 68/500 [07:56<19:05, 2.65s/it, loss=0.879, lr=0.000134]\nSteps: 14%|█▎ | 68/500 [07:57<19:05, 
2.65s/it, loss=1.06, lr=0.000136] \nSteps: 14%|█▍ | 69/500 [08:04<30:06, 4.19s/it, loss=1.06, lr=0.000136]\nSteps: 14%|█▍ | 69/500 [08:04<30:06, 4.19s/it, loss=0.825, lr=0.000138]\nSteps: 14%|█▍ | 70/500 [08:06<25:02, 3.50s/it, loss=0.825, lr=0.000138]\nSteps: 14%|█▍ | 70/500 [08:06<25:02, 3.50s/it, loss=0.924, lr=0.00014] \nSteps: 14%|█▍ | 71/500 [08:08<21:30, 3.01s/it, loss=0.924, lr=0.00014]\nSteps: 14%|█▍ | 71/500 [08:08<21:30, 3.01s/it, loss=0.794, lr=0.000142]\nSteps: 14%|█▍ | 72/500 [08:10<19:01, 2.67s/it, loss=0.794, lr=0.000142]\nSteps: 14%|█▍ | 72/500 [08:10<19:01, 2.67s/it, loss=0.978, lr=0.000144]\nSteps: 15%|█▍ | 73/500 [08:18<29:45, 4.18s/it, loss=0.978, lr=0.000144]\nSteps: 15%|█▍ | 73/500 [08:18<29:45, 4.18s/it, loss=0.996, lr=0.000146]\nSteps: 15%|█▍ | 74/500 [08:19<24:46, 3.49s/it, loss=0.996, lr=0.000146]\nSteps: 15%|█▍ | 74/500 [08:19<24:46, 3.49s/it, loss=1.07, lr=0.000148] \nSteps: 15%|█▌ | 75/500 [08:21<21:16, 3.00s/it, loss=1.07, lr=0.000148]\nSteps: 15%|█▌ | 75/500 [08:21<21:16, 3.00s/it, loss=0.842, lr=0.00015]\nSteps: 15%|█▌ | 76/500 [08:23<18:50, 2.67s/it, loss=0.842, lr=0.00015]\nSteps: 15%|█▌ | 76/500 [08:23<18:50, 2.67s/it, loss=0.946, lr=0.000152]\nSteps: 15%|█▌ | 77/500 [08:31<29:24, 4.17s/it, loss=0.946, lr=0.000152]\nSteps: 15%|█▌ | 77/500 [08:31<29:24, 4.17s/it, loss=0.838, lr=0.000154]\nSteps: 16%|█▌ | 78/500 [08:33<24:29, 3.48s/it, loss=0.838, lr=0.000154]\nSteps: 16%|█▌ | 78/500 [08:33<24:29, 3.48s/it, loss=1.06, lr=0.000156] \nSteps: 16%|█▌ | 79/500 [08:35<21:02, 3.00s/it, loss=1.06, lr=0.000156]\nSteps: 16%|█▌ | 79/500 [08:35<21:02, 3.00s/it, loss=0.85, lr=0.000158]\nSteps: 16%|█▌ | 80/500 [08:37<18:37, 2.66s/it, loss=0.85, lr=0.000158]\nSteps: 16%|█▌ | 80/500 [08:37<18:37, 2.66s/it, loss=0.923, lr=0.00016]\nSteps: 16%|█▌ | 81/500 [08:44<29:10, 4.18s/it, loss=0.923, lr=0.00016]\nSteps: 16%|█▌ | 81/500 [08:44<29:10, 4.18s/it, loss=0.764, lr=0.000162]\nSteps: 16%|█▋ | 82/500 [08:46<24:17, 3.49s/it, loss=0.764, 
lr=0.000162]\nSteps: 16%|█▋ | 82/500 [08:46<24:17, 3.49s/it, loss=0.94, lr=0.000164] \nSteps: 17%|█▋ | 83/500 [08:48<20:51, 3.00s/it, loss=0.94, lr=0.000164]\nSteps: 17%|█▋ | 83/500 [08:48<20:51, 3.00s/it, loss=0.828, lr=0.000166]\nSteps: 17%|█▋ | 84/500 [08:50<18:27, 2.66s/it, loss=0.828, lr=0.000166]\nSteps: 17%|█▋ | 84/500 [08:50<18:27, 2.66s/it, loss=1.02, lr=0.000168] \nSteps: 17%|█▋ | 85/500 [08:58<28:59, 4.19s/it, loss=1.02, lr=0.000168]\nSteps: 17%|█▋ | 85/500 [08:58<28:59, 4.19s/it, loss=0.991, lr=0.00017]\nSteps: 17%|█▋ | 86/500 [08:59<24:07, 3.50s/it, loss=0.991, lr=0.00017]\nSteps: 17%|█▋ | 86/500 [09:00<24:07, 3.50s/it, loss=0.975, lr=0.000172]\nSteps: 17%|█▋ | 87/500 [09:01<20:42, 3.01s/it, loss=0.975, lr=0.000172]\nSteps: 17%|█▋ | 87/500 [09:01<20:42, 3.01s/it, loss=0.814, lr=0.000174]\nSteps: 18%|█▊ | 88/500 [09:03<18:19, 2.67s/it, loss=0.814, lr=0.000174]\nSteps: 18%|█▊ | 88/500 [09:03<18:19, 2.67s/it, loss=1.07, lr=0.000176] \nSteps: 18%|█▊ | 89/500 [09:11<28:25, 4.15s/it, loss=1.07, lr=0.000176]\nSteps: 18%|█▊ | 89/500 [09:11<28:25, 4.15s/it, loss=0.859, lr=0.000178]\nSteps: 18%|█▊ | 90/500 [09:13<23:41, 3.47s/it, loss=0.859, lr=0.000178]\nSteps: 18%|█▊ | 90/500 [09:13<23:41, 3.47s/it, loss=1.06, lr=0.00018] \nSteps: 18%|█▊ | 91/500 [09:15<20:21, 2.99s/it, loss=1.06, lr=0.00018]\nSteps: 18%|█▊ | 91/500 [09:15<20:21, 2.99s/it, loss=0.825, lr=0.000182]\nSteps: 18%|█▊ | 92/500 [09:16<18:02, 2.65s/it, loss=0.825, lr=0.000182]\nSteps: 18%|█▊ | 92/500 [09:16<18:02, 2.65s/it, loss=0.954, lr=0.000184]\nSteps: 19%|█▊ | 93/500 [09:24<28:00, 4.13s/it, loss=0.954, lr=0.000184]\nSteps: 19%|█▊ | 93/500 [09:24<28:00, 4.13s/it, loss=0.852, lr=0.000186]\nSteps: 19%|█▉ | 94/500 [09:26<23:21, 3.45s/it, loss=0.852, lr=0.000186]\nSteps: 19%|█▉ | 94/500 [09:26<23:21, 3.45s/it, loss=1.04, lr=0.000188] \nSteps: 19%|█▉ | 95/500 [09:28<20:06, 2.98s/it, loss=1.04, lr=0.000188]\nSteps: 19%|█▉ | 95/500 [09:28<20:06, 2.98s/it, loss=0.847, lr=0.00019]\nSteps: 19%|█▉ | 96/500 
[09:30<17:49, 2.65s/it, loss=0.847, lr=0.00019]\nSteps: 19%|█▉ | 96/500 [09:30<17:49, 2.65s/it, loss=0.921, lr=0.000192]\nSteps: 19%|█▉ | 97/500 [09:37<27:56, 4.16s/it, loss=0.921, lr=0.000192]\nSteps: 19%|█▉ | 97/500 [09:37<27:56, 4.16s/it, loss=0.873, lr=0.000194]\nSteps: 20%|█▉ | 98/500 [09:39<23:16, 3.47s/it, loss=0.873, lr=0.000194]\nSteps: 20%|█▉ | 98/500 [09:39<23:16, 3.47s/it, loss=0.977, lr=0.000196]\nSteps: 20%|█▉ | 99/500 [09:41<20:00, 2.99s/it, loss=0.977, lr=0.000196]\nSteps: 20%|█▉ | 99/500 [09:41<20:00, 2.99s/it, loss=0.851, lr=0.000198]\nSteps: 20%|██ | 100/500 [09:43<17:44, 2.66s/it, loss=0.851, lr=0.000198]\nSteps: 20%|██ | 100/500 [09:43<17:44, 2.66s/it, loss=0.918, lr=0.0002] \nSteps: 20%|██ | 101/500 [09:51<28:24, 4.27s/it, loss=0.918, lr=0.0002]\nSteps: 20%|██ | 101/500 [09:51<28:24, 4.27s/it, loss=0.809, lr=0.000202]\nSteps: 20%|██ | 102/500 [09:53<23:33, 3.55s/it, loss=0.809, lr=0.000202]\nSteps: 20%|██ | 102/500 [09:53<23:33, 3.55s/it, loss=0.916, lr=0.000204]\nSteps: 21%|██ | 103/500 [09:55<20:10, 3.05s/it, loss=0.916, lr=0.000204]\nSteps: 21%|██ | 103/500 [09:55<20:10, 3.05s/it, loss=1.01, lr=0.000206] \nSteps: 21%|██ | 104/500 [09:57<17:48, 2.70s/it, loss=1.01, lr=0.000206]\nSteps: 21%|██ | 104/500 [09:57<17:48, 2.70s/it, loss=0.958, lr=0.000208]\nSteps: 21%|██ | 105/500 [10:05<28:03, 4.26s/it, loss=0.958, lr=0.000208]\nSteps: 21%|██ | 105/500 [10:05<28:03, 4.26s/it, loss=0.807, lr=0.00021] \nSteps: 21%|██ | 106/500 [10:06<23:16, 3.55s/it, loss=0.807, lr=0.00021]\nSteps: 21%|██ | 106/500 [10:06<23:16, 3.55s/it, loss=0.953, lr=0.000212]\nSteps: 21%|██▏ | 107/500 [10:08<19:56, 3.04s/it, loss=0.953, lr=0.000212]\nSteps: 21%|██▏ | 107/500 [10:08<19:56, 3.04s/it, loss=0.826, lr=0.000214]\nSteps: 22%|██▏ | 108/500 [10:10<17:35, 2.69s/it, loss=0.826, lr=0.000214]\nSteps: 22%|██▏ | 108/500 [10:10<17:35, 2.69s/it, loss=1.08, lr=0.000216] \nSteps: 22%|██▏ | 109/500 [10:18<27:22, 4.20s/it, loss=1.08, lr=0.000216]\nSteps: 22%|██▏ | 109/500 
[10:18<27:22, 4.20s/it, loss=0.836, lr=0.000218]\nSteps: 22%|██▏ | 110/500 [10:20<22:46, 3.50s/it, loss=0.836, lr=0.000218]\nSteps: 22%|██▏ | 110/500 [10:20<22:46, 3.50s/it, loss=1.07, lr=0.00022] \nSteps: 22%|██▏ | 111/500 [10:22<19:32, 3.01s/it, loss=1.07, lr=0.00022]\nSteps: 22%|██▏ | 111/500 [10:22<19:32, 3.01s/it, loss=0.824, lr=0.000222]\nSteps: 22%|██▏ | 112/500 [10:24<17:16, 2.67s/it, loss=0.824, lr=0.000222]\nSteps: 22%|██▏ | 112/500 [10:24<17:16, 2.67s/it, loss=0.916, lr=0.000224]\nSteps: 23%|██▎ | 113/500 [10:31<26:58, 4.18s/it, loss=0.916, lr=0.000224]\nSteps: 23%|██▎ | 113/500 [10:31<26:58, 4.18s/it, loss=0.793, lr=0.000226]\nSteps: 23%|██▎ | 114/500 [10:33<22:27, 3.49s/it, loss=0.793, lr=0.000226]\nSteps: 23%|██▎ | 114/500 [10:33<22:27, 3.49s/it, loss=0.927, lr=0.000228]\nSteps: 23%|██▎ | 115/500 [10:35<19:17, 3.01s/it, loss=0.927, lr=0.000228]\nSteps: 23%|██▎ | 115/500 [10:35<19:17, 3.01s/it, loss=0.924, lr=0.00023] \nSteps: 23%|██▎ | 116/500 [10:37<17:03, 2.67s/it, loss=0.924, lr=0.00023]\nSteps: 23%|██▎ | 116/500 [10:37<17:03, 2.67s/it, loss=1.04, lr=0.000232]\nSteps: 23%|██▎ | 117/500 [10:44<26:32, 4.16s/it, loss=1.04, lr=0.000232]\nSteps: 23%|██▎ | 117/500 [10:44<26:32, 4.16s/it, loss=0.857, lr=0.000234]\nSteps: 24%|██▎ | 118/500 [10:46<22:06, 3.47s/it, loss=0.857, lr=0.000234]\nSteps: 24%|██▎ | 118/500 [10:46<22:06, 3.47s/it, loss=0.91, lr=0.000236] \nSteps: 24%|██▍ | 119/500 [10:48<19:00, 2.99s/it, loss=0.91, lr=0.000236]\nSteps: 24%|██▍ | 119/500 [10:48<19:00, 2.99s/it, loss=0.781, lr=0.000238]\nSteps: 24%|██▍ | 120/500 [10:50<16:49, 2.66s/it, loss=0.781, lr=0.000238]\nSteps: 24%|██▍ | 120/500 [10:50<16:49, 2.66s/it, loss=0.937, lr=0.00024] \nSteps: 24%|██▍ | 121/500 [10:58<26:42, 4.23s/it, loss=0.937, lr=0.00024]\nSteps: 24%|██▍ | 121/500 [10:58<26:42, 4.23s/it, loss=0.876, lr=0.000242]\nSteps: 24%|██▍ | 122/500 [11:00<22:10, 3.52s/it, loss=0.876, lr=0.000242]\nSteps: 24%|██▍ | 122/500 [11:00<22:10, 3.52s/it, loss=0.971, lr=0.000244]\nSteps: 
25%|██▍ | 123/500 [11:02<19:00, 3.03s/it, loss=0.971, lr=0.000244]\nSteps: 25%|██▍ | 123/500 [11:02<19:00, 3.03s/it, loss=0.812, lr=0.000246]\nSteps: 25%|██▍ | 124/500 [11:04<16:47, 2.68s/it, loss=0.812, lr=0.000246]\nSteps: 25%|██▍ | 124/500 [11:04<16:47, 2.68s/it, loss=1, lr=0.000248] \nSteps: 25%|██▌ | 125/500 [11:11<26:07, 4.18s/it, loss=1, lr=0.000248]\nSteps: 25%|██▌ | 125/500 [11:11<26:07, 4.18s/it, loss=0.97, lr=0.00025]\nSteps: 25%|██▌ | 126/500 [11:13<21:44, 3.49s/it, loss=0.97, lr=0.00025]\nSteps: 25%|██▌ | 126/500 [11:13<21:44, 3.49s/it, loss=1.07, lr=0.000252]\nSteps: 25%|██▌ | 127/500 [11:15<18:40, 3.00s/it, loss=1.07, lr=0.000252]\nSteps: 25%|██▌ | 127/500 [11:15<18:40, 3.00s/it, loss=0.814, lr=0.000254]\nSteps: 26%|██▌ | 128/500 [11:17<16:31, 2.67s/it, loss=0.814, lr=0.000254]\nSteps: 26%|██▌ | 128/500 [11:17<16:31, 2.67s/it, loss=0.904, lr=0.000256]\nSteps: 26%|██▌ | 129/500 [11:25<25:55, 4.19s/it, loss=0.904, lr=0.000256]\nSteps: 26%|██▌ | 129/500 [11:25<25:55, 4.19s/it, loss=0.885, lr=0.000258]\nSteps: 26%|██▌ | 130/500 [11:27<21:33, 3.50s/it, loss=0.885, lr=0.000258]\nSteps: 26%|██▌ | 130/500 [11:27<21:33, 3.50s/it, loss=0.923, lr=0.00026] \nSteps: 26%|██▌ | 131/500 [11:28<18:30, 3.01s/it, loss=0.923, lr=0.00026]\nSteps: 26%|██▌ | 131/500 [11:28<18:30, 3.01s/it, loss=0.812, lr=0.000262]\nSteps: 26%|██▋ | 132/500 [11:30<16:21, 2.67s/it, loss=0.812, lr=0.000262]\nSteps: 26%|██▋ | 132/500 [11:30<16:21, 2.67s/it, loss=0.986, lr=0.000264]\nSteps: 27%|██▋ | 133/500 [11:38<25:53, 4.23s/it, loss=0.986, lr=0.000264]\nSteps: 27%|██▋ | 133/500 [11:38<25:53, 4.23s/it, loss=0.823, lr=0.000266]\nSteps: 27%|██▋ | 134/500 [11:40<21:31, 3.53s/it, loss=0.823, lr=0.000266]\nSteps: 27%|██▋ | 134/500 [11:40<21:31, 3.53s/it, loss=1.06, lr=0.000268] \nSteps: 27%|██▋ | 135/500 [11:42<18:26, 3.03s/it, loss=1.06, lr=0.000268]\nSteps: 27%|██▋ | 135/500 [11:42<18:26, 3.03s/it, loss=1.07, lr=0.00027] \nSteps: 27%|██▋ | 136/500 [11:44<16:17, 2.69s/it, loss=1.07, 
lr=0.00027]\nSteps: 27%|██▋ | 136/500 [11:44<16:17, 2.69s/it, loss=0.961, lr=0.000272]\nSteps: 27%|██▋ | 137/500 [11:51<25:19, 4.19s/it, loss=0.961, lr=0.000272]\nSteps: 27%|██▋ | 137/500 [11:52<25:19, 4.19s/it, loss=0.842, lr=0.000274]\nSteps: 28%|██▊ | 138/500 [11:53<21:04, 3.49s/it, loss=0.842, lr=0.000274]\nSteps: 28%|██▊ | 138/500 [11:53<21:04, 3.49s/it, loss=0.952, lr=0.000276]\nSteps: 28%|██▊ | 139/500 [11:55<18:05, 3.01s/it, loss=0.952, lr=0.000276]\nSteps: 28%|██▊ | 139/500 [11:55<18:05, 3.01s/it, loss=0.901, lr=0.000278]\nSteps: 28%|██▊ | 140/500 [11:57<16:00, 2.67s/it, loss=0.901, lr=0.000278]\nSteps: 28%|██▊ | 140/500 [11:57<16:00, 2.67s/it, loss=0.926, lr=0.00028] \nSteps: 28%|██▊ | 141/500 [12:05<25:06, 4.20s/it, loss=0.926, lr=0.00028]\nSteps: 28%|██▊ | 141/500 [12:05<25:06, 4.20s/it, loss=0.808, lr=0.000282]\nSteps: 28%|██▊ | 142/500 [12:07<20:52, 3.50s/it, loss=0.808, lr=0.000282]\nSteps: 28%|██▊ | 142/500 [12:07<20:52, 3.50s/it, loss=0.926, lr=0.000284]\nSteps: 29%|██▊ | 143/500 [12:09<17:55, 3.01s/it, loss=0.926, lr=0.000284]\nSteps: 29%|██▊ | 143/500 [12:09<17:55, 3.01s/it, loss=0.951, lr=0.000286]\nSteps: 29%|██▉ | 144/500 [12:11<15:51, 2.67s/it, loss=0.951, lr=0.000286]\nSteps: 29%|██▉ | 144/500 [12:11<15:51, 2.67s/it, loss=0.911, lr=0.000288]\nSteps: 29%|██▉ | 145/500 [12:18<24:43, 4.18s/it, loss=0.911, lr=0.000288]\nSteps: 29%|██▉ | 145/500 [12:18<24:43, 4.18s/it, loss=0.806, lr=0.00029] \nSteps: 29%|██▉ | 146/500 [12:20<20:34, 3.49s/it, loss=0.806, lr=0.00029]\nSteps: 29%|██▉ | 146/500 [12:20<20:34, 3.49s/it, loss=0.901, lr=0.000292]\nSteps: 29%|██▉ | 147/500 [12:22<17:40, 3.00s/it, loss=0.901, lr=0.000292]\nSteps: 29%|██▉ | 147/500 [12:22<17:40, 3.00s/it, loss=0.847, lr=0.000294]\nSteps: 30%|██▉ | 148/500 [12:24<15:38, 2.67s/it, loss=0.847, lr=0.000294]\nSteps: 30%|██▉ | 148/500 [12:24<15:38, 2.67s/it, loss=0.963, lr=0.000296]\nSteps: 30%|██▉ | 149/500 [12:32<24:33, 4.20s/it, loss=0.963, lr=0.000296]\nSteps: 30%|██▉ | 149/500 [12:32<24:33, 
4.20s/it, loss=1, lr=0.000298] \nSteps: 30%|███ | 150/500 [12:33<20:24, 3.50s/it, loss=1, lr=0.000298]\nSteps: 30%|███ | 150/500 [12:33<20:24, 3.50s/it, loss=0.897, lr=0.0003]\nSteps: 30%|███ | 151/500 [12:35<17:30, 3.01s/it, loss=0.897, lr=0.0003]\nSteps: 30%|███ | 151/500 [12:35<17:30, 3.01s/it, loss=0.842, lr=0.000302]\nSteps: 30%|███ | 152/500 [12:37<15:28, 2.67s/it, loss=0.842, lr=0.000302]\nSteps: 30%|███ | 152/500 [12:37<15:28, 2.67s/it, loss=1.07, lr=0.000304] \nSteps: 31%|███ | 153/500 [12:45<24:13, 4.19s/it, loss=1.07, lr=0.000304]\nSteps: 31%|███ | 153/500 [12:45<24:13, 4.19s/it, loss=0.861, lr=0.000306]\nSteps: 31%|███ | 154/500 [12:47<20:09, 3.49s/it, loss=0.861, lr=0.000306]\nSteps: 31%|███ | 154/500 [12:47<20:09, 3.49s/it, loss=0.903, lr=0.000308]\nSteps: 31%|███ | 155/500 [12:49<17:17, 3.01s/it, loss=0.903, lr=0.000308]\nSteps: 31%|███ | 155/500 [12:49<17:17, 3.01s/it, loss=0.86, lr=0.00031] \nSteps: 31%|███ | 156/500 [12:51<15:18, 2.67s/it, loss=0.86, lr=0.00031]\nSteps: 31%|███ | 156/500 [12:51<15:18, 2.67s/it, loss=0.904, lr=0.000312]\nSteps: 31%|███▏ | 157/500 [12:58<24:02, 4.21s/it, loss=0.904, lr=0.000312]\nSteps: 31%|███▏ | 157/500 [12:58<24:02, 4.21s/it, loss=1.05, lr=0.000314] \nSteps: 32%|███▏ | 158/500 [13:00<19:58, 3.51s/it, loss=1.05, lr=0.000314]\nSteps: 32%|███▏ | 158/500 [13:00<19:58, 3.51s/it, loss=1.02, lr=0.000316]\nSteps: 32%|███▏ | 159/500 [13:02<17:08, 3.02s/it, loss=1.02, lr=0.000316]\nSteps: 32%|███▏ | 159/500 [13:02<17:08, 3.02s/it, loss=0.964, lr=0.000318]\nSteps: 32%|███▏ | 160/500 [13:04<15:08, 2.67s/it, loss=0.964, lr=0.000318]\nSteps: 32%|███▏ | 160/500 [13:04<15:08, 2.67s/it, loss=0.909, lr=0.00032] \nSteps: 32%|███▏ | 161/500 [13:12<23:30, 4.16s/it, loss=0.909, lr=0.00032]\nSteps: 32%|███▏ | 161/500 [13:12<23:30, 4.16s/it, loss=0.874, lr=0.000322]\nSteps: 32%|███▏ | 162/500 [13:13<19:34, 3.47s/it, loss=0.874, lr=0.000322]\nSteps: 32%|███▏ | 162/500 [13:14<19:34, 3.47s/it, loss=0.932, lr=0.000324]\nSteps: 33%|███▎ | 
163/500 [13:15<16:49, 2.99s/it, loss=0.932, lr=0.000324]\nSteps: 33%|███▎ | 163/500 [13:15<16:49, 2.99s/it, loss=0.917, lr=0.000326]\nSteps: 33%|███▎ | 164/500 [13:17<14:53, 2.66s/it, loss=0.917, lr=0.000326]\nSteps: 33%|███▎ | 164/500 [13:17<14:53, 2.66s/it, loss=1.07, lr=0.000328] \nSteps: 33%|███▎ | 165/500 [13:25<23:19, 4.18s/it, loss=1.07, lr=0.000328]\nSteps: 33%|███▎ | 165/500 [13:25<23:19, 4.18s/it, loss=0.855, lr=0.00033]\nSteps: 33%|███▎ | 166/500 [13:27<19:24, 3.49s/it, loss=0.855, lr=0.00033]\nSteps: 33%|███▎ | 166/500 [13:27<19:24, 3.49s/it, loss=0.986, lr=0.000332]\nSteps: 33%|███▎ | 167/500 [13:29<16:39, 3.00s/it, loss=0.986, lr=0.000332]\nSteps: 33%|███▎ | 167/500 [13:29<16:39, 3.00s/it, loss=0.814, lr=0.000334]\nSteps: 34%|███▎ | 168/500 [13:31<14:44, 2.66s/it, loss=0.814, lr=0.000334]\nSteps: 34%|███▎ | 168/500 [13:31<14:44, 2.66s/it, loss=0.92, lr=0.000336] \nSteps: 34%|███▍ | 169/500 [13:38<23:13, 4.21s/it, loss=0.92, lr=0.000336]\nSteps: 34%|███▍ | 169/500 [13:38<23:13, 4.21s/it, loss=0.835, lr=0.000338]\nSteps: 34%|███▍ | 170/500 [13:40<19:17, 3.51s/it, loss=0.835, lr=0.000338]\nSteps: 34%|███▍ | 170/500 [13:40<19:17, 3.51s/it, loss=1.08, lr=0.00034] \nSteps: 34%|███▍ | 171/500 [13:42<16:33, 3.02s/it, loss=1.08, lr=0.00034]\nSteps: 34%|███▍ | 171/500 [13:42<16:33, 3.02s/it, loss=0.988, lr=0.000342]\nSteps: 34%|███▍ | 172/500 [13:44<14:37, 2.68s/it, loss=0.988, lr=0.000342]\nSteps: 34%|███▍ | 172/500 [13:44<14:37, 2.68s/it, loss=1, lr=0.000344] \nSteps: 35%|███▍ | 173/500 [13:52<22:40, 4.16s/it, loss=1, lr=0.000344]\nSteps: 35%|███▍ | 173/500 [13:52<22:40, 4.16s/it, loss=1.04, lr=0.000346]\nSteps: 35%|███▍ | 174/500 [13:54<18:52, 3.47s/it, loss=1.04, lr=0.000346]\nSteps: 35%|███▍ | 174/500 [13:54<18:52, 3.47s/it, loss=1.05, lr=0.000348]\nSteps: 35%|███▌ | 175/500 [13:55<16:13, 2.99s/it, loss=1.05, lr=0.000348]\nSteps: 35%|███▌ | 175/500 [13:55<16:13, 2.99s/it, loss=0.996, lr=0.00035]\nSteps: 35%|███▌ | 176/500 [13:57<14:20, 2.66s/it, 
loss=0.996, lr=0.00035]\nSteps: 35%|███▌ | 176/500 [13:57<14:20, 2.66s/it, loss=1.06, lr=0.000352]\nSteps: 35%|███▌ | 177/500 [14:05<22:27, 4.17s/it, loss=1.06, lr=0.000352]\nSteps: 35%|███▌ | 177/500 [14:05<22:27, 4.17s/it, loss=0.994, lr=0.000354]\nSteps: 36%|███▌ | 178/500 [14:07<18:41, 3.48s/it, loss=0.994, lr=0.000354]\nSteps: 36%|███▌ | 178/500 [14:07<18:41, 3.48s/it, loss=0.987, lr=0.000356]\nSteps: 36%|███▌ | 179/500 [14:09<16:02, 3.00s/it, loss=0.987, lr=0.000356]\nSteps: 36%|███▌ | 179/500 [14:09<16:02, 3.00s/it, loss=0.81, lr=0.000358] \nSteps: 36%|███▌ | 180/500 [14:11<14:11, 2.66s/it, loss=0.81, lr=0.000358]\nSteps: 36%|███▌ | 180/500 [14:11<14:11, 2.66s/it, loss=0.944, lr=0.00036]\nSteps: 36%|███▌ | 181/500 [14:18<22:05, 4.16s/it, loss=0.944, lr=0.00036]\nSteps: 36%|███▌ | 181/500 [14:18<22:05, 4.16s/it, loss=0.856, lr=0.000362]\nSteps: 36%|███▋ | 182/500 [14:20<18:23, 3.47s/it, loss=0.856, lr=0.000362]\nSteps: 36%|███▋ | 182/500 [14:20<18:23, 3.47s/it, loss=0.956, lr=0.000364]\nSteps: 37%|███▋ | 183/500 [14:22<15:48, 2.99s/it, loss=0.956, lr=0.000364]\nSteps: 37%|███▋ | 183/500 [14:22<15:48, 2.99s/it, loss=0.823, lr=0.000366]\nSteps: 37%|███▋ | 184/500 [14:24<13:59, 2.66s/it, loss=0.823, lr=0.000366]\nSteps: 37%|███▋ | 184/500 [14:24<13:59, 2.66s/it, loss=0.963, lr=0.000368]\nSteps: 37%|███▋ | 185/500 [14:31<21:45, 4.15s/it, loss=0.963, lr=0.000368]\nSteps: 37%|███▋ | 185/500 [14:31<21:45, 4.15s/it, loss=0.971, lr=0.00037] \nSteps: 37%|███▋ | 186/500 [14:33<18:07, 3.46s/it, loss=0.971, lr=0.00037]\nSteps: 37%|███▋ | 186/500 [14:33<18:07, 3.46s/it, loss=1.01, lr=0.000372]\nSteps: 37%|███▋ | 187/500 [14:35<15:34, 2.99s/it, loss=1.01, lr=0.000372]\nSteps: 37%|███▋ | 187/500 [14:35<15:34, 2.99s/it, loss=0.855, lr=0.000374]\nSteps: 38%|███▊ | 188/500 [14:37<13:47, 2.65s/it, loss=0.855, lr=0.000374]\nSteps: 38%|███▊ | 188/500 [14:37<13:47, 2.65s/it, loss=1.06, lr=0.000376] \nSteps: 38%|███▊ | 189/500 [14:45<21:33, 4.16s/it, loss=1.06, lr=0.000376]\nSteps: 
38%|███▊ | 189/500 [14:45<21:33, 4.16s/it, loss=0.906, lr=0.000378]\nSteps: 38%|███▊ | 190/500 [14:47<17:56, 3.47s/it, loss=0.906, lr=0.000378]\nSteps: 38%|███▊ | 190/500 [14:47<17:56, 3.47s/it, loss=0.957, lr=0.00038] \nSteps: 38%|███▊ | 191/500 [14:49<15:24, 2.99s/it, loss=0.957, lr=0.00038]\nSteps: 38%|███▊ | 191/500 [14:49<15:24, 2.99s/it, loss=0.874, lr=0.000382]\nSteps: 38%|███▊ | 192/500 [14:50<13:38, 2.66s/it, loss=0.874, lr=0.000382]\nSteps: 38%|███▊ | 192/500 [14:50<13:38, 2.66s/it, loss=0.902, lr=0.000384]\nSteps: 39%|███▊ | 193/500 [14:58<21:24, 4.19s/it, loss=0.902, lr=0.000384]\nSteps: 39%|███▊ | 193/500 [14:58<21:24, 4.19s/it, loss=1.06, lr=0.000386] \nSteps: 39%|███▉ | 194/500 [15:00<17:48, 3.49s/it, loss=1.06, lr=0.000386]\nSteps: 39%|███▉ | 194/500 [15:00<17:48, 3.49s/it, loss=0.955, lr=0.000388]\nSteps: 39%|███▉ | 195/500 [15:02<15:16, 3.01s/it, loss=0.955, lr=0.000388]\nSteps: 39%|███▉ | 195/500 [15:02<15:16, 3.01s/it, loss=0.808, lr=0.00039] \nSteps: 39%|███▉ | 196/500 [15:04<13:30, 2.66s/it, loss=0.808, lr=0.00039]\nSteps: 39%|███▉ | 196/500 [15:04<13:30, 2.66s/it, loss=0.925, lr=0.000392]\nSteps: 39%|███▉ | 197/500 [15:12<21:19, 4.22s/it, loss=0.925, lr=0.000392]\nSteps: 39%|███▉ | 197/500 [15:12<21:19, 4.22s/it, loss=0.869, lr=0.000394]\nSteps: 40%|███▉ | 198/500 [15:13<17:42, 3.52s/it, loss=0.869, lr=0.000394]\nSteps: 40%|███▉ | 198/500 [15:13<17:42, 3.52s/it, loss=1.08, lr=0.000396] \nSteps: 40%|███▉ | 199/500 [15:15<15:10, 3.02s/it, loss=1.08, lr=0.000396]\nSteps: 40%|███▉ | 199/500 [15:15<15:10, 3.02s/it, loss=0.829, lr=0.000398]\nSteps: 40%|████ | 200/500 [15:17<13:23, 2.68s/it, loss=0.829, lr=0.000398]\nSteps: 40%|████ | 200/500 [15:17<13:23, 2.68s/it, loss=1.05, lr=0.0004] \nSteps: 40%|████ | 201/500 [15:25<20:55, 4.20s/it, loss=1.05, lr=0.0004]\nSteps: 40%|████ | 201/500 [15:25<20:55, 4.20s/it, loss=1.03, lr=0.0004]\nSteps: 40%|████ | 202/500 [15:27<17:23, 3.50s/it, loss=1.03, lr=0.0004]\nSteps: 40%|████ | 202/500 [15:27<17:23, 
3.50s/it, loss=1.06, lr=0.0004]\nSteps: 41%|████ | 203/500 [15:29<14:54, 3.01s/it, loss=1.06, lr=0.0004]\nSteps: 41%|████ | 203/500 [15:29<14:54, 3.01s/it, loss=0.846, lr=0.0004]\nSteps: 41%|████ | 204/500 [15:31<13:10, 2.67s/it, loss=0.846, lr=0.0004]\nSteps: 41%|████ | 204/500 [15:31<13:10, 2.67s/it, loss=0.921, lr=0.0004]\nSteps: 41%|████ | 205/500 [15:38<20:37, 4.19s/it, loss=0.921, lr=0.0004]\nSteps: 41%|████ | 205/500 [15:38<20:37, 4.19s/it, loss=0.856, lr=0.0004]\nSteps: 41%|████ | 206/500 [15:40<17:08, 3.50s/it, loss=0.856, lr=0.0004]\nSteps: 41%|████ | 206/500 [15:40<17:08, 3.50s/it, loss=1.06, lr=0.0004] \nSteps: 41%|████▏ | 207/500 [15:42<14:41, 3.01s/it, loss=1.06, lr=0.0004]\nSteps: 41%|████▏ | 207/500 [15:42<14:41, 3.01s/it, loss=0.81, lr=0.000399]\nSteps: 42%|████▏ | 208/500 [15:44<12:59, 2.67s/it, loss=0.81, lr=0.000399]\nSteps: 42%|████▏ | 208/500 [15:44<12:59, 2.67s/it, loss=0.961, lr=0.000399]\nSteps: 42%|████▏ | 209/500 [15:52<20:15, 4.18s/it, loss=0.961, lr=0.000399]\nSteps: 42%|████▏ | 209/500 [15:52<20:15, 4.18s/it, loss=0.809, lr=0.000399]\nSteps: 42%|████▏ | 210/500 [15:54<16:50, 3.48s/it, loss=0.809, lr=0.000399]\nSteps: 42%|████▏ | 210/500 [15:54<16:50, 3.48s/it, loss=0.983, lr=0.000399]\nSteps: 42%|████▏ | 211/500 [15:55<14:27, 3.00s/it, loss=0.983, lr=0.000399]\nSteps: 42%|████▏ | 211/500 [15:55<14:27, 3.00s/it, loss=0.865, lr=0.000399]\nSteps: 42%|████▏ | 212/500 [15:57<12:46, 2.66s/it, loss=0.865, lr=0.000399]\nSteps: 42%|████▏ | 212/500 [15:57<12:46, 2.66s/it, loss=0.927, lr=0.000398]\nSteps: 43%|████▎ | 213/500 [16:05<19:55, 4.17s/it, loss=0.927, lr=0.000398]\nSteps: 43%|████▎ | 213/500 [16:05<19:55, 4.17s/it, loss=0.799, lr=0.000398]\nSteps: 43%|████▎ | 214/500 [16:07<16:34, 3.48s/it, loss=0.799, lr=0.000398]\nSteps: 43%|████▎ | 214/500 [16:07<16:34, 3.48s/it, loss=1.02, lr=0.000398] \nSteps: 43%|████▎ | 215/500 [16:09<14:13, 3.00s/it, loss=1.02, lr=0.000398]\nSteps: 43%|████▎ | 215/500 [16:09<14:13, 3.00s/it, loss=0.864, 
lr=0.000398]\nSteps: 43%|████▎ | 216/500 [16:11<12:34, 2.66s/it, loss=0.864, lr=0.000398]\nSteps: 43%|████▎ | 216/500 [16:11<12:34, 2.66s/it, loss=0.976, lr=0.000397]\nSteps: 43%|████▎ | 217/500 [16:18<19:35, 4.15s/it, loss=0.976, lr=0.000397]\nSteps: 43%|████▎ | 217/500 [16:18<19:35, 4.15s/it, loss=0.859, lr=0.000397]\nSteps: 44%|████▎ | 218/500 [16:20<16:18, 3.47s/it, loss=0.859, lr=0.000397]\nSteps: 44%|████▎ | 218/500 [16:20<16:18, 3.47s/it, loss=0.9, lr=0.000396] \nSteps: 44%|████▍ | 219/500 [16:22<14:00, 2.99s/it, loss=0.9, lr=0.000396]\nSteps: 44%|████▍ | 219/500 [16:22<14:00, 2.99s/it, loss=0.935, lr=0.000396]\nSteps: 44%|████▍ | 220/500 [16:24<12:23, 2.66s/it, loss=0.935, lr=0.000396]\nSteps: 44%|████▍ | 220/500 [16:24<12:23, 2.66s/it, loss=0.919, lr=0.000396]\nSteps: 44%|████▍ | 221/500 [16:31<19:15, 4.14s/it, loss=0.919, lr=0.000396]\nSteps: 44%|████▍ | 221/500 [16:31<19:15, 4.14s/it, loss=0.849, lr=0.000395]\nSteps: 44%|████▍ | 222/500 [16:33<16:01, 3.46s/it, loss=0.849, lr=0.000395]\nSteps: 44%|████▍ | 222/500 [16:33<16:01, 3.46s/it, loss=0.985, lr=0.000395]\nSteps: 45%|████▍ | 223/500 [16:35<13:46, 2.98s/it, loss=0.985, lr=0.000395]\nSteps: 45%|████▍ | 223/500 [16:35<13:46, 2.98s/it, loss=0.798, lr=0.000394]\nSteps: 45%|████▍ | 224/500 [16:37<12:11, 2.65s/it, loss=0.798, lr=0.000394]\nSteps: 45%|████▍ | 224/500 [16:37<12:11, 2.65s/it, loss=0.896, lr=0.000394]\nSteps: 45%|████▌ | 225/500 [16:45<19:01, 4.15s/it, loss=0.896, lr=0.000394]\nSteps: 45%|████▌ | 225/500 [16:45<19:01, 4.15s/it, loss=0.772, lr=0.000393]\nSteps: 45%|████▌ | 226/500 [16:47<15:50, 3.47s/it, loss=0.772, lr=0.000393]\nSteps: 45%|████▌ | 226/500 [16:47<15:50, 3.47s/it, loss=0.968, lr=0.000393]\nSteps: 45%|████▌ | 227/500 [16:48<13:36, 2.99s/it, loss=0.968, lr=0.000393]\nSteps: 45%|████▌ | 227/500 [16:48<13:36, 2.99s/it, loss=0.943, lr=0.000392]\nSteps: 46%|████▌ | 228/500 [16:50<12:01, 2.65s/it, loss=0.943, lr=0.000392]\nSteps: 46%|████▌ | 228/500 [16:50<12:01, 2.65s/it, loss=0.951, 
lr=0.000391]\nSteps: 46%|████▌ | 229/500 [16:58<18:52, 4.18s/it, loss=0.951, lr=0.000391]\nSteps: 46%|████▌ | 229/500 [16:58<18:52, 4.18s/it, loss=0.839, lr=0.000391]\nSteps: 46%|████▌ | 230/500 [17:00<15:41, 3.49s/it, loss=0.839, lr=0.000391]\nSteps: 46%|████▌ | 230/500 [17:00<15:41, 3.49s/it, loss=1.02, lr=0.00039] \nSteps: 46%|████▌ | 231/500 [17:02<13:27, 3.00s/it, loss=1.02, lr=0.00039]\nSteps: 46%|████▌ | 231/500 [17:02<13:27, 3.00s/it, loss=0.854, lr=0.00039]\nSteps: 46%|████▋ | 232/500 [17:04<11:53, 2.66s/it, loss=0.854, lr=0.00039]\nSteps: 46%|████▋ | 232/500 [17:04<11:53, 2.66s/it, loss=0.958, lr=0.000389]\nSteps: 47%|████▋ | 233/500 [17:11<18:44, 4.21s/it, loss=0.958, lr=0.000389]\nSteps: 47%|████▋ | 233/500 [17:11<18:44, 4.21s/it, loss=1.06, lr=0.000388] \nSteps: 47%|████▋ | 234/500 [17:13<15:33, 3.51s/it, loss=1.06, lr=0.000388]\nSteps: 47%|████▋ | 234/500 [17:13<15:33, 3.51s/it, loss=1.07, lr=0.000387]\nSteps: 47%|████▋ | 235/500 [17:15<13:20, 3.02s/it, loss=1.07, lr=0.000387]\nSteps: 47%|████▋ | 235/500 [17:15<13:20, 3.02s/it, loss=0.996, lr=0.000387]\nSteps: 47%|████▋ | 236/500 [17:17<11:46, 2.68s/it, loss=0.996, lr=0.000387]\nSteps: 47%|████▋ | 236/500 [17:17<11:46, 2.68s/it, loss=0.889, lr=0.000386]\nSteps: 47%|████▋ | 237/500 [17:25<18:25, 4.20s/it, loss=0.889, lr=0.000386]\nSteps: 47%|████▋ | 237/500 [17:25<18:25, 4.20s/it, loss=0.789, lr=0.000385]\nSteps: 48%|████▊ | 238/500 [17:27<15:18, 3.51s/it, loss=0.789, lr=0.000385]\nSteps: 48%|████▊ | 238/500 [17:27<15:18, 3.51s/it, loss=1.04, lr=0.000384] \nSteps: 48%|████▊ | 239/500 [17:29<13:07, 3.02s/it, loss=1.04, lr=0.000384]\nSteps: 48%|████▊ | 239/500 [17:29<13:07, 3.02s/it, loss=0.85, lr=0.000384]\nSteps: 48%|████▊ | 240/500 [17:31<11:35, 2.67s/it, loss=0.85, lr=0.000384]\nSteps: 48%|████▊ | 240/500 [17:31<11:35, 2.67s/it, loss=0.976, lr=0.000383]\nSteps: 48%|████▊ | 241/500 [17:38<18:00, 4.17s/it, loss=0.976, lr=0.000383]\nSteps: 48%|████▊ | 241/500 [17:38<18:00, 4.17s/it, loss=0.842, 
lr=0.000382]\nSteps: 48%|████▊ | 242/500 [17:40<14:57, 3.48s/it, loss=0.842, lr=0.000382]\nSteps: 48%|████▊ | 242/500 [17:40<14:57, 3.48s/it, loss=1.01, lr=0.000381] \nSteps: 49%|████▊ | 243/500 [17:42<12:50, 3.00s/it, loss=1.01, lr=0.000381]\nSteps: 49%|████▊ | 243/500 [17:42<12:50, 3.00s/it, loss=0.848, lr=0.00038]\nSteps: 49%|████▉ | 244/500 [17:44<11:21, 2.66s/it, loss=0.848, lr=0.00038]\nSteps: 49%|████▉ | 244/500 [17:44<11:21, 2.66s/it, loss=1.07, lr=0.000379]\nSteps: 49%|████▉ | 245/500 [17:51<17:42, 4.17s/it, loss=1.07, lr=0.000379]\nSteps: 49%|████▉ | 245/500 [17:51<17:42, 4.17s/it, loss=1.05, lr=0.000378]\nSteps: 49%|████▉ | 246/500 [17:53<14:43, 3.48s/it, loss=1.05, lr=0.000378]\nSteps: 49%|████▉ | 246/500 [17:53<14:43, 3.48s/it, loss=0.908, lr=0.000377]\nSteps: 49%|████▉ | 247/500 [17:55<12:38, 3.00s/it, loss=0.908, lr=0.000377]\nSteps: 49%|████▉ | 247/500 [17:55<12:38, 3.00s/it, loss=0.8, lr=0.000376] \nSteps: 50%|████▉ | 248/500 [17:57<11:10, 2.66s/it, loss=0.8, lr=0.000376]\nSteps: 50%|████▉ | 248/500 [17:57<11:10, 2.66s/it, loss=1.07, lr=0.000375]\nSteps: 50%|████▉ | 249/500 [18:05<17:29, 4.18s/it, loss=1.07, lr=0.000375]\nSteps: 50%|████▉ | 249/500 [18:05<17:29, 4.18s/it, loss=0.966, lr=0.000374]\nSteps: 50%|█████ | 250/500 [18:07<14:31, 3.49s/it, loss=0.966, lr=0.000374]\nSteps: 50%|█████ | 250/500 [18:07<14:31, 3.49s/it, loss=1.07, lr=0.000373] \nSteps: 50%|█████ | 251/500 [18:09<12:27, 3.00s/it, loss=1.07, lr=0.000373]\nSteps: 50%|█████ | 251/500 [18:09<12:27, 3.00s/it, loss=0.789, lr=0.000372]\nSteps: 50%|█████ | 252/500 [18:10<11:00, 2.66s/it, loss=0.789, lr=0.000372]\nSteps: 50%|█████ | 252/500 [18:10<11:00, 2.66s/it, loss=0.945, lr=0.000371]\nSteps: 51%|█████ | 253/500 [18:18<17:08, 4.16s/it, loss=0.945, lr=0.000371]\nSteps: 51%|█████ | 253/500 [18:18<17:08, 4.16s/it, loss=0.83, lr=0.00037] \nSteps: 51%|█████ | 254/500 [18:20<14:15, 3.48s/it, loss=0.83, lr=0.00037]\nSteps: 51%|█████ | 254/500 [18:20<14:15, 3.48s/it, loss=0.999, 
lr=0.000369]\nSteps: 51%|█████ | 255/500 [18:22<12:13, 3.00s/it, loss=0.999, lr=0.000369]\nSteps: 51%|█████ | 255/500 [18:22<12:13, 3.00s/it, loss=0.883, lr=0.000368]\nSteps: 51%|█████ | 256/500 [18:24<10:48, 2.66s/it, loss=0.883, lr=0.000368]\nSteps: 51%|█████ | 256/500 [18:24<10:48, 2.66s/it, loss=1.07, lr=0.000367] \nSteps: 51%|█████▏ | 257/500 [18:32<17:04, 4.21s/it, loss=1.07, lr=0.000367]\nSteps: 51%|█████▏ | 257/500 [18:32<17:04, 4.21s/it, loss=0.81, lr=0.000365]\nSteps: 52%|█████▏ | 258/500 [18:33<14:10, 3.51s/it, loss=0.81, lr=0.000365]\nSteps: 52%|█████▏ | 258/500 [18:33<14:10, 3.51s/it, loss=0.94, lr=0.000364]\nSteps: 52%|█████▏ | 259/500 [18:35<12:08, 3.02s/it, loss=0.94, lr=0.000364]\nSteps: 52%|█████▏ | 259/500 [18:35<12:08, 3.02s/it, loss=0.963, lr=0.000363]\nSteps: 52%|█████▏ | 260/500 [18:37<10:42, 2.68s/it, loss=0.963, lr=0.000363]\nSteps: 52%|█████▏ | 260/500 [18:37<10:42, 2.68s/it, loss=0.942, lr=0.000362]\nSteps: 52%|█████▏ | 261/500 [18:45<16:53, 4.24s/it, loss=0.942, lr=0.000362]\nSteps: 52%|█████▏ | 261/500 [18:45<16:53, 4.24s/it, loss=0.962, lr=0.000361]\nSteps: 52%|█████▏ | 262/500 [18:47<14:00, 3.53s/it, loss=0.962, lr=0.000361]\nSteps: 52%|█████▏ | 262/500 [18:47<14:00, 3.53s/it, loss=0.922, lr=0.000359]\nSteps: 53%|█████▎ | 263/500 [18:49<11:58, 3.03s/it, loss=0.922, lr=0.000359]\nSteps: 53%|█████▎ | 263/500 [18:49<11:58, 3.03s/it, loss=0.8, lr=0.000358] \nSteps: 53%|█████▎ | 264/500 [18:51<10:33, 2.69s/it, loss=0.8, lr=0.000358]\nSteps: 53%|█████▎ | 264/500 [18:51<10:33, 2.69s/it, loss=0.954, lr=0.000357]\nSteps: 53%|█████▎ | 265/500 [18:58<16:26, 4.20s/it, loss=0.954, lr=0.000357]\nSteps: 53%|█████▎ | 265/500 [18:58<16:26, 4.20s/it, loss=0.852, lr=0.000355]\nSteps: 53%|█████▎ | 266/500 [19:00<13:39, 3.50s/it, loss=0.852, lr=0.000355]\nSteps: 53%|█████▎ | 266/500 [19:00<13:39, 3.50s/it, loss=0.9, lr=0.000354] \nSteps: 53%|█████▎ | 267/500 [19:02<11:42, 3.02s/it, loss=0.9, lr=0.000354]\nSteps: 53%|█████▎ | 267/500 [19:02<11:42, 
3.02s/it, loss=0.838, lr=0.000353]\nSteps: 54%|█████▎ | 268/500 [19:04<10:20, 2.67s/it, loss=0.838, lr=0.000353]\nSteps: 54%|█████▎ | 268/500 [19:04<10:20, 2.67s/it, loss=1.07, lr=0.000351] \nSteps: 54%|█████▍ | 269/500 [19:12<16:02, 4.17s/it, loss=1.07, lr=0.000351]\nSteps: 54%|█████▍ | 269/500 [19:12<16:02, 4.17s/it, loss=0.983, lr=0.00035]\nSteps: 54%|█████▍ | 270/500 [19:14<13:20, 3.48s/it, loss=0.983, lr=0.00035]\nSteps: 54%|█████▍ | 270/500 [19:14<13:20, 3.48s/it, loss=0.957, lr=0.000349]\nSteps: 54%|█████▍ | 271/500 [19:15<11:26, 3.00s/it, loss=0.957, lr=0.000349]\nSteps: 54%|█████▍ | 271/500 [19:15<11:26, 3.00s/it, loss=0.828, lr=0.000347]\nSteps: 54%|█████▍ | 272/500 [19:17<10:06, 2.66s/it, loss=0.828, lr=0.000347]\nSteps: 54%|█████▍ | 272/500 [19:17<10:06, 2.66s/it, loss=0.946, lr=0.000346]\nSteps: 55%|█████▍ | 273/500 [19:25<15:43, 4.16s/it, loss=0.946, lr=0.000346]\nSteps: 55%|█████▍ | 273/500 [19:25<15:43, 4.16s/it, loss=1.01, lr=0.000344] \nSteps: 55%|█████▍ | 274/500 [19:27<13:04, 3.47s/it, loss=1.01, lr=0.000344]\nSteps: 55%|█████▍ | 274/500 [19:27<13:04, 3.47s/it, loss=0.915, lr=0.000343]\nSteps: 55%|█████▌ | 275/500 [19:29<11:13, 2.99s/it, loss=0.915, lr=0.000343]\nSteps: 55%|█████▌ | 275/500 [19:29<11:13, 2.99s/it, loss=0.881, lr=0.000341]\nSteps: 55%|█████▌ | 276/500 [19:31<09:55, 2.66s/it, loss=0.881, lr=0.000341]\nSteps: 55%|█████▌ | 276/500 [19:31<09:55, 2.66s/it, loss=0.896, lr=0.00034] \nSteps: 55%|█████▌ | 277/500 [19:38<15:23, 4.14s/it, loss=0.896, lr=0.00034]\nSteps: 55%|█████▌ | 277/500 [19:38<15:23, 4.14s/it, loss=0.863, lr=0.000338]\nSteps: 56%|█████▌ | 278/500 [19:40<12:48, 3.46s/it, loss=0.863, lr=0.000338]\nSteps: 56%|█████▌ | 278/500 [19:40<12:48, 3.46s/it, loss=0.968, lr=0.000337]\nSteps: 56%|█████▌ | 279/500 [19:42<10:59, 2.99s/it, loss=0.968, lr=0.000337]\nSteps: 56%|█████▌ | 279/500 [19:42<10:59, 2.99s/it, loss=0.817, lr=0.000335]\nSteps: 56%|█████▌ | 280/500 [19:44<09:43, 2.65s/it, loss=0.817, lr=0.000335]\nSteps: 56%|█████▌ 
| 280/500 [19:44<09:43, 2.65s/it, loss=1.07, lr=0.000334] \nSteps: 56%|█████▌ | 281/500 [19:51<15:08, 4.15s/it, loss=1.07, lr=0.000334]\nSteps: 56%|█████▌ | 281/500 [19:51<15:08, 4.15s/it, loss=0.795, lr=0.000332]\nSteps: 56%|█████▋ | 282/500 [19:53<12:35, 3.46s/it, loss=0.795, lr=0.000332]\nSteps: 56%|█████▋ | 282/500 [19:53<12:35, 3.46s/it, loss=0.99, lr=0.000331] \nSteps: 57%|█████▋ | 283/500 [19:55<10:48, 2.99s/it, loss=0.99, lr=0.000331]\nSteps: 57%|█████▋ | 283/500 [19:55<10:48, 2.99s/it, loss=0.844, lr=0.000329]\nSteps: 57%|█████▋ | 284/500 [19:57<09:32, 2.65s/it, loss=0.844, lr=0.000329]\nSteps: 57%|█████▋ | 284/500 [19:57<09:32, 2.65s/it, loss=0.94, lr=0.000327] \nSteps: 57%|█████▋ | 285/500 [20:05<14:54, 4.16s/it, loss=0.94, lr=0.000327]\nSteps: 57%|█████▋ | 285/500 [20:05<14:54, 4.16s/it, loss=0.9, lr=0.000326] \nSteps: 57%|█████▋ | 286/500 [20:07<12:24, 3.48s/it, loss=0.9, lr=0.000326]\nSteps: 57%|█████▋ | 286/500 [20:07<12:24, 3.48s/it, loss=1.06, lr=0.000324]\nSteps: 57%|█████▋ | 287/500 [20:09<10:38, 3.00s/it, loss=1.06, lr=0.000324]\nSteps: 57%|█████▋ | 287/500 [20:09<10:38, 3.00s/it, loss=1.02, lr=0.000323]\nSteps: 58%|█████▊ | 288/500 [20:10<09:23, 2.66s/it, loss=1.02, lr=0.000323]\nSteps: 58%|█████▊ | 288/500 [20:10<09:23, 2.66s/it, loss=1.03, lr=0.000321]\nSteps: 58%|█████▊ | 289/500 [20:18<14:35, 4.15s/it, loss=1.03, lr=0.000321]\nSteps: 58%|█████▊ | 289/500 [20:18<14:35, 4.15s/it, loss=1.05, lr=0.000319]\nSteps: 58%|█████▊ | 290/500 [20:20<12:07, 3.47s/it, loss=1.05, lr=0.000319]\nSteps: 58%|█████▊ | 290/500 [20:20<12:07, 3.47s/it, loss=0.899, lr=0.000318]\nSteps: 58%|█████▊ | 291/500 [20:22<10:24, 2.99s/it, loss=0.899, lr=0.000318]\nSteps: 58%|█████▊ | 291/500 [20:22<10:24, 2.99s/it, loss=1.03, lr=0.000316] \nSteps: 58%|█████▊ | 292/500 [20:24<09:11, 2.65s/it, loss=1.03, lr=0.000316]\nSteps: 58%|█████▊ | 292/500 [20:24<09:11, 2.65s/it, loss=1.03, lr=0.000314]\nSteps: 59%|█████▊ | 293/500 [20:31<14:28, 4.19s/it, loss=1.03, lr=0.000314]\nSteps: 
59%|█████▊ | 293/500 [20:31<14:28, 4.19s/it, loss=0.821, lr=0.000312]\nSteps: 59%|█████▉ | 294/500 [20:33<12:00, 3.50s/it, loss=0.821, lr=0.000312]\nSteps: 59%|█████▉ | 294/500 [20:33<12:00, 3.50s/it, loss=0.884, lr=0.000311]\nSteps: 59%|█████▉ | 295/500 [20:35<10:17, 3.01s/it, loss=0.884, lr=0.000311]\nSteps: 59%|█████▉ | 295/500 [20:35<10:17, 3.01s/it, loss=0.792, lr=0.000309]\nSteps: 59%|█████▉ | 296/500 [20:37<09:04, 2.67s/it, loss=0.792, lr=0.000309]\nSteps: 59%|█████▉ | 296/500 [20:37<09:04, 2.67s/it, loss=1.01, lr=0.000307] \nSteps: 59%|█████▉ | 297/500 [20:45<14:02, 4.15s/it, loss=1.01, lr=0.000307]\nSteps: 59%|█████▉ | 297/500 [20:45<14:02, 4.15s/it, loss=0.787, lr=0.000305]\nSteps: 60%|█████▉ | 298/500 [20:47<11:40, 3.47s/it, loss=0.787, lr=0.000305]\nSteps: 60%|█████▉ | 298/500 [20:47<11:40, 3.47s/it, loss=0.909, lr=0.000304]\nSteps: 60%|█████▉ | 299/500 [20:48<10:00, 2.99s/it, loss=0.909, lr=0.000304]\nSteps: 60%|█████▉ | 299/500 [20:48<10:00, 2.99s/it, loss=0.832, lr=0.000302]\nSteps: 60%|██████ | 300/500 [20:50<08:50, 2.65s/it, loss=0.832, lr=0.000302]\nSteps: 60%|██████ | 300/500 [20:50<08:50, 2.65s/it, loss=0.945, lr=0.0003] \nSteps: 60%|██████ | 301/500 [20:58<13:47, 4.16s/it, loss=0.945, lr=0.0003]\nSteps: 60%|██████ | 301/500 [20:58<13:47, 4.16s/it, loss=0.866, lr=0.000298]\nSteps: 60%|██████ | 302/500 [21:00<11:27, 3.47s/it, loss=0.866, lr=0.000298]\nSteps: 60%|██████ | 302/500 [21:00<11:27, 3.47s/it, loss=0.905, lr=0.000296]\nSteps: 61%|██████ | 303/500 [21:02<09:49, 2.99s/it, loss=0.905, lr=0.000296]\nSteps: 61%|██████ | 303/500 [21:02<09:49, 2.99s/it, loss=0.818, lr=0.000295]\nSteps: 61%|██████ | 304/500 [21:04<08:40, 2.66s/it, loss=0.818, lr=0.000295]\nSteps: 61%|██████ | 304/500 [21:04<08:40, 2.66s/it, loss=0.912, lr=0.000293]\nSteps: 61%|██████ | 305/500 [21:11<13:32, 4.16s/it, loss=0.912, lr=0.000293]\nSteps: 61%|██████ | 305/500 [21:11<13:32, 4.16s/it, loss=0.784, lr=0.000291]\nSteps: 61%|██████ | 306/500 [21:13<11:14, 3.48s/it, 
loss=0.784, lr=0.000291]\nSteps: 61%|██████ | 306/500 [21:13<11:14, 3.48s/it, loss=1.03, lr=0.000289] \nSteps: 61%|██████▏ | 307/500 [21:15<09:38, 3.00s/it, loss=1.03, lr=0.000289]\nSteps: 61%|██████▏ | 307/500 [21:15<09:38, 3.00s/it, loss=1.05, lr=0.000287]\nSteps: 62%|██████▏ | 308/500 [21:17<08:30, 2.66s/it, loss=1.05, lr=0.000287]\nSteps: 62%|██████▏ | 308/500 [21:17<08:30, 2.66s/it, loss=1.04, lr=0.000285]\nSteps: 62%|██████▏ | 309/500 [21:25<13:15, 4.16s/it, loss=1.04, lr=0.000285]\nSteps: 62%|██████▏ | 309/500 [21:25<13:15, 4.16s/it, loss=1.06, lr=0.000283]\nSteps: 62%|██████▏ | 310/500 [21:26<11:00, 3.47s/it, loss=1.06, lr=0.000283]\nSteps: 62%|██████▏ | 310/500 [21:26<11:00, 3.47s/it, loss=0.99, lr=0.000281]\nSteps: 62%|██████▏ | 311/500 [21:28<09:25, 2.99s/it, loss=0.99, lr=0.000281]\nSteps: 62%|██████▏ | 311/500 [21:28<09:25, 2.99s/it, loss=0.86, lr=0.000279]\nSteps: 62%|██████▏ | 312/500 [21:30<08:19, 2.66s/it, loss=0.86, lr=0.000279]\nSteps: 62%|██████▏ | 312/500 [21:30<08:19, 2.66s/it, loss=0.877, lr=0.000278]\nSteps: 63%|██████▎ | 313/500 [21:38<13:00, 4.17s/it, loss=0.877, lr=0.000278]\nSteps: 63%|██████▎ | 313/500 [21:38<13:00, 4.17s/it, loss=0.82, lr=0.000276] \nSteps: 63%|██████▎ | 314/500 [21:40<10:47, 3.48s/it, loss=0.82, lr=0.000276]\nSteps: 63%|██████▎ | 314/500 [21:40<10:47, 3.48s/it, loss=0.89, lr=0.000274]\nSteps: 63%|██████▎ | 315/500 [21:42<09:15, 3.00s/it, loss=0.89, lr=0.000274]\nSteps: 63%|██████▎ | 315/500 [21:42<09:15, 3.00s/it, loss=0.855, lr=0.000272]\nSteps: 63%|██████▎ | 316/500 [21:43<08:09, 2.66s/it, loss=0.855, lr=0.000272]\nSteps: 63%|██████▎ | 316/500 [21:43<08:09, 2.66s/it, loss=1.01, lr=0.00027] \nSteps: 63%|██████▎ | 317/500 [21:51<12:43, 4.17s/it, loss=1.01, lr=0.00027]\nSteps: 63%|██████▎ | 317/500 [21:51<12:43, 4.17s/it, loss=0.9, lr=0.000268]\nSteps: 64%|██████▎ | 318/500 [21:53<10:33, 3.48s/it, loss=0.9, lr=0.000268]\nSteps: 64%|██████▎ | 318/500 [21:53<10:33, 3.48s/it, loss=0.966, lr=0.000266]\nSteps: 64%|██████▍ | 
319/500 [21:55<09:02, 3.00s/it, loss=0.966, lr=0.000266]\nSteps: 64%|██████▍ | 319/500 [21:55<09:02, 3.00s/it, loss=0.968, lr=0.000264]\nSteps: 64%|██████▍ | 320/500 [21:57<07:58, 2.66s/it, loss=0.968, lr=0.000264]\nSteps: 64%|██████▍ | 320/500 [21:57<07:58, 2.66s/it, loss=0.891, lr=0.000262]\nSteps: 64%|██████▍ | 321/500 [22:04<12:20, 4.14s/it, loss=0.891, lr=0.000262]\nSteps: 64%|██████▍ | 321/500 [22:04<12:20, 4.14s/it, loss=0.787, lr=0.00026] \nSteps: 64%|██████▍ | 322/500 [22:06<10:15, 3.46s/it, loss=0.787, lr=0.00026]\nSteps: 64%|██████▍ | 322/500 [22:06<10:15, 3.46s/it, loss=0.878, lr=0.000258]\nSteps: 65%|██████▍ | 323/500 [22:08<08:47, 2.98s/it, loss=0.878, lr=0.000258]\nSteps: 65%|██████▍ | 323/500 [22:08<08:47, 2.98s/it, loss=0.852, lr=0.000256]\nSteps: 65%|██████▍ | 324/500 [22:10<07:46, 2.65s/it, loss=0.852, lr=0.000256]\nSteps: 65%|██████▍ | 324/500 [22:10<07:46, 2.65s/it, loss=1.02, lr=0.000254] \nSteps: 65%|██████▌ | 325/500 [22:18<12:03, 4.14s/it, loss=1.02, lr=0.000254]\nSteps: 65%|██████▌ | 325/500 [22:18<12:03, 4.14s/it, loss=0.878, lr=0.000252]\nSteps: 65%|██████▌ | 326/500 [22:19<10:01, 3.46s/it, loss=0.878, lr=0.000252]\nSteps: 65%|██████▌ | 326/500 [22:19<10:01, 3.46s/it, loss=0.878, lr=0.00025] \nSteps: 65%|██████▌ | 327/500 [22:21<08:35, 2.98s/it, loss=0.878, lr=0.00025]\nSteps: 65%|██████▌ | 327/500 [22:21<08:35, 2.98s/it, loss=0.845, lr=0.000248]\nSteps: 66%|██████▌ | 328/500 [22:23<07:35, 2.65s/it, loss=0.845, lr=0.000248]\nSteps: 66%|██████▌ | 328/500 [22:23<07:35, 2.65s/it, loss=0.905, lr=0.000246]\nSteps: 66%|██████▌ | 329/500 [22:31<11:55, 4.18s/it, loss=0.905, lr=0.000246]\nSteps: 66%|██████▌ | 329/500 [22:31<11:55, 4.18s/it, loss=1.05, lr=0.000244] \nSteps: 66%|██████▌ | 330/500 [22:33<09:53, 3.49s/it, loss=1.05, lr=0.000244]\nSteps: 66%|██████▌ | 330/500 [22:33<09:53, 3.49s/it, loss=0.936, lr=0.000242]\nSteps: 66%|██████▌ | 331/500 [22:35<08:27, 3.00s/it, loss=0.936, lr=0.000242]\nSteps: 66%|██████▌ | 331/500 [22:35<08:27, 
3.00s/it, loss=0.834, lr=0.00024] \nSteps: 66%|██████▋ | 332/500 [22:37<07:27, 2.67s/it, loss=0.834, lr=0.00024]\nSteps: 66%|██████▋ | 332/500 [22:37<07:27, 2.67s/it, loss=1, lr=0.000237] \nSteps: 67%|██████▋ | 333/500 [22:44<11:31, 4.14s/it, loss=1, lr=0.000237]\nSteps: 67%|██████▋ | 333/500 [22:44<11:31, 4.14s/it, loss=0.791, lr=0.000235]\nSteps: 67%|██████▋ | 334/500 [22:46<09:34, 3.46s/it, loss=0.791, lr=0.000235]\nSteps: 67%|██████▋ | 334/500 [22:46<09:34, 3.46s/it, loss=0.893, lr=0.000233]\nSteps: 67%|██████▋ | 335/500 [22:48<08:12, 2.98s/it, loss=0.893, lr=0.000233]\nSteps: 67%|██████▋ | 335/500 [22:48<08:12, 2.98s/it, loss=1.03, lr=0.000231] \nSteps: 67%|██████▋ | 336/500 [22:50<07:14, 2.65s/it, loss=1.03, lr=0.000231]\nSteps: 67%|██████▋ | 336/500 [22:50<07:14, 2.65s/it, loss=1.03, lr=0.000229]\nSteps: 67%|██████▋ | 337/500 [22:58<11:23, 4.20s/it, loss=1.03, lr=0.000229]\nSteps: 67%|██████▋ | 337/500 [22:58<11:23, 4.20s/it, loss=1.03, lr=0.000227]\nSteps: 68%|██████▊ | 338/500 [22:59<09:26, 3.50s/it, loss=1.03, lr=0.000227]\nSteps: 68%|██████▊ | 338/500 [22:59<09:26, 3.50s/it, loss=0.882, lr=0.000225]\nSteps: 68%|██████▊ | 339/500 [23:01<08:04, 3.01s/it, loss=0.882, lr=0.000225]\nSteps: 68%|██████▊ | 339/500 [23:01<08:04, 3.01s/it, loss=0.792, lr=0.000223]\nSteps: 68%|██████▊ | 340/500 [23:03<07:07, 2.67s/it, loss=0.792, lr=0.000223]\nSteps: 68%|██████▊ | 340/500 [23:03<07:07, 2.67s/it, loss=0.974, lr=0.000221]\nSteps: 68%|██████▊ | 341/500 [23:11<10:57, 4.14s/it, loss=0.974, lr=0.000221]\nSteps: 68%|██████▊ | 341/500 [23:11<10:57, 4.14s/it, loss=0.83, lr=0.000219] \nSteps: 68%|██████▊ | 342/500 [23:13<09:06, 3.46s/it, loss=0.83, lr=0.000219]\nSteps: 68%|██████▊ | 342/500 [23:13<09:06, 3.46s/it, loss=0.874, lr=0.000217]\nSteps: 69%|██████▊ | 343/500 [23:14<07:48, 2.98s/it, loss=0.874, lr=0.000217]\nSteps: 69%|██████▊ | 343/500 [23:15<07:48, 2.98s/it, loss=0.789, lr=0.000215]\nSteps: 69%|██████▉ | 344/500 [23:16<06:53, 2.65s/it, loss=0.789, 
lr=0.000215]\nSteps: 69%|██████▉ | 344/500 [23:16<06:53, 2.65s/it, loss=0.975, lr=0.000213]\nSteps: 69%|██████▉ | 345/500 [23:24<10:41, 4.14s/it, loss=0.975, lr=0.000213]\nSteps: 69%|██████▉ | 345/500 [23:24<10:41, 4.14s/it, loss=0.838, lr=0.00021] \nSteps: 69%|██████▉ | 346/500 [23:26<08:52, 3.46s/it, loss=0.838, lr=0.00021]\nSteps: 69%|██████▉ | 346/500 [23:26<08:52, 3.46s/it, loss=1.02, lr=0.000208]\nSteps: 69%|██████▉ | 347/500 [23:28<07:36, 2.98s/it, loss=1.02, lr=0.000208]\nSteps: 69%|██████▉ | 347/500 [23:28<07:36, 2.98s/it, loss=0.815, lr=0.000206]\nSteps: 70%|██████▉ | 348/500 [23:30<06:43, 2.65s/it, loss=0.815, lr=0.000206]\nSteps: 70%|██████▉ | 348/500 [23:30<06:43, 2.65s/it, loss=0.865, lr=0.000204]\nSteps: 70%|██████▉ | 349/500 [23:37<10:27, 4.15s/it, loss=0.865, lr=0.000204]\nSteps: 70%|██████▉ | 349/500 [23:37<10:27, 4.15s/it, loss=0.806, lr=0.000202]\nSteps: 70%|███████ | 350/500 [23:39<08:40, 3.47s/it, loss=0.806, lr=0.000202]\nSteps: 70%|███████ | 350/500 [23:39<08:40, 3.47s/it, loss=0.869, lr=0.0002] \nSteps: 70%|███████ | 351/500 [23:41<07:25, 2.99s/it, loss=0.869, lr=0.0002]\nSteps: 70%|███████ | 351/500 [23:41<07:25, 2.99s/it, loss=0.812, lr=0.000198]\nSteps: 70%|███████ | 352/500 [23:43<06:33, 2.66s/it, loss=0.812, lr=0.000198]\nSteps: 70%|███████ | 352/500 [23:43<06:33, 2.66s/it, loss=1.01, lr=0.000196] \nSteps: 71%|███████ | 353/500 [23:51<10:15, 4.19s/it, loss=1.01, lr=0.000196]\nSteps: 71%|███████ | 353/500 [23:51<10:15, 4.19s/it, loss=1.01, lr=0.000194]\nSteps: 71%|███████ | 354/500 [23:53<08:29, 3.49s/it, loss=1.01, lr=0.000194]\nSteps: 71%|███████ | 354/500 [23:53<08:29, 3.49s/it, loss=0.951, lr=0.000192]\nSteps: 71%|███████ | 355/500 [23:54<07:15, 3.01s/it, loss=0.951, lr=0.000192]\nSteps: 71%|███████ | 355/500 [23:54<07:15, 3.01s/it, loss=0.849, lr=0.00019] \nSteps: 71%|███████ | 356/500 [23:56<06:23, 2.67s/it, loss=0.849, lr=0.00019]\nSteps: 71%|███████ | 356/500 [23:56<06:23, 2.67s/it, loss=1.06, lr=0.000187]\nSteps: 71%|███████▏ | 
357/500 [24:04<09:51, 4.14s/it, loss=1.06, lr=0.000187]\nSteps: 71%|███████▏ | 357/500 [24:04<09:51, 4.14s/it, loss=1.03, lr=0.000185]\nSteps: 72%|███████▏ | 358/500 [24:06<08:10, 3.46s/it, loss=1.03, lr=0.000185]\nSteps: 72%|███████▏ | 358/500 [24:06<08:10, 3.46s/it, loss=0.889, lr=0.000183]\nSteps: 72%|███████▏ | 359/500 [24:08<07:00, 2.98s/it, loss=0.889, lr=0.000183]\nSteps: 72%|███████▏ | 359/500 [24:08<07:00, 2.98s/it, loss=0.818, lr=0.000181]\nSteps: 72%|███████▏ | 360/500 [24:09<06:10, 2.65s/it, loss=0.818, lr=0.000181]\nSteps: 72%|███████▏ | 360/500 [24:09<06:10, 2.65s/it, loss=1, lr=0.000179] \nSteps: 72%|███████▏ | 361/500 [24:17<09:34, 4.13s/it, loss=1, lr=0.000179]\nSteps: 72%|███████▏ | 361/500 [24:17<09:34, 4.13s/it, loss=0.996, lr=0.000177]\nSteps: 72%|███████▏ | 362/500 [24:19<07:57, 3.46s/it, loss=0.996, lr=0.000177]\nSteps: 72%|███████▏ | 362/500 [24:19<07:57, 3.46s/it, loss=1.04, lr=0.000175] \nSteps: 73%|███████▎ | 363/500 [24:21<06:48, 2.98s/it, loss=1.04, lr=0.000175]\nSteps: 73%|███████▎ | 363/500 [24:21<06:48, 2.98s/it, loss=0.784, lr=0.000173]\nSteps: 73%|███████▎ | 364/500 [24:23<06:00, 2.65s/it, loss=0.784, lr=0.000173]\nSteps: 73%|███████▎ | 364/500 [24:23<06:00, 2.65s/it, loss=0.997, lr=0.000171]\nSteps: 73%|███████▎ | 365/500 [24:30<09:22, 4.17s/it, loss=0.997, lr=0.000171]\nSteps: 73%|███████▎ | 365/500 [24:30<09:22, 4.17s/it, loss=0.794, lr=0.000169]\nSteps: 73%|███████▎ | 366/500 [24:32<07:45, 3.48s/it, loss=0.794, lr=0.000169]\nSteps: 73%|███████▎ | 366/500 [24:32<07:45, 3.48s/it, loss=0.874, lr=0.000167]\nSteps: 73%|███████▎ | 367/500 [24:34<06:38, 3.00s/it, loss=0.874, lr=0.000167]\nSteps: 73%|███████▎ | 367/500 [24:34<06:38, 3.00s/it, loss=0.848, lr=0.000165]\nSteps: 74%|███████▎ | 368/500 [24:36<05:50, 2.66s/it, loss=0.848, lr=0.000165]\nSteps: 74%|███████▎ | 368/500 [24:36<05:50, 2.66s/it, loss=0.964, lr=0.000163]\nSteps: 74%|███████▍ | 369/500 [24:44<09:07, 4.18s/it, loss=0.964, lr=0.000163]\nSteps: 74%|███████▍ | 369/500 
[24:44<09:07, 4.18s/it, loss=0.778, lr=0.00016] \nSteps: 74%|███████▍ | 370/500 [24:46<07:33, 3.49s/it, loss=0.778, lr=0.00016]\nSteps: 74%|███████▍ | 370/500 [24:46<07:33, 3.49s/it, loss=1.04, lr=0.000158]\nSteps: 74%|███████▍ | 371/500 [24:47<06:27, 3.00s/it, loss=1.04, lr=0.000158]\nSteps: 74%|███████▍ | 371/500 [24:47<06:27, 3.00s/it, loss=1, lr=0.000156] \nSteps: 74%|███████▍ | 372/500 [24:49<05:41, 2.67s/it, loss=1, lr=0.000156]\nSteps: 74%|███████▍ | 372/500 [24:49<05:41, 2.67s/it, loss=0.937, lr=0.000154]\nSteps: 75%|███████▍ | 373/500 [24:57<08:47, 4.15s/it, loss=0.937, lr=0.000154]\nSteps: 75%|███████▍ | 373/500 [24:57<08:47, 4.15s/it, loss=1.05, lr=0.000152] \nSteps: 75%|███████▍ | 374/500 [24:59<07:17, 3.47s/it, loss=1.05, lr=0.000152]\nSteps: 75%|███████▍ | 374/500 [24:59<07:17, 3.47s/it, loss=0.894, lr=0.00015]\nSteps: 75%|███████▌ | 375/500 [25:01<06:13, 2.99s/it, loss=0.894, lr=0.00015]\nSteps: 75%|███████▌ | 375/500 [25:01<06:13, 2.99s/it, loss=0.821, lr=0.000148]\nSteps: 75%|███████▌ | 376/500 [25:03<05:29, 2.66s/it, loss=0.821, lr=0.000148]\nSteps: 75%|███████▌ | 376/500 [25:03<05:29, 2.66s/it, loss=1.04, lr=0.000146] \nSteps: 75%|███████▌ | 377/500 [25:10<08:34, 4.19s/it, loss=1.04, lr=0.000146]\nSteps: 75%|███████▌ | 377/500 [25:10<08:34, 4.19s/it, loss=0.978, lr=0.000144]\nSteps: 76%|███████▌ | 378/500 [25:12<07:05, 3.49s/it, loss=0.978, lr=0.000144]\nSteps: 76%|███████▌ | 378/500 [25:12<07:05, 3.49s/it, loss=0.943, lr=0.000142]\nSteps: 76%|███████▌ | 379/500 [25:14<06:03, 3.01s/it, loss=0.943, lr=0.000142]\nSteps: 76%|███████▌ | 379/500 [25:14<06:03, 3.01s/it, loss=1.05, lr=0.00014] \nSteps: 76%|███████▌ | 380/500 [25:16<05:20, 2.67s/it, loss=1.05, lr=0.00014]\nSteps: 76%|███████▌ | 380/500 [25:16<05:20, 2.67s/it, loss=0.892, lr=0.000138]\nSteps: 76%|███████▌ | 381/500 [25:24<08:12, 4.14s/it, loss=0.892, lr=0.000138]\nSteps: 76%|███████▌ | 381/500 [25:24<08:12, 4.14s/it, loss=0.82, lr=0.000136] \nSteps: 76%|███████▋ | 382/500 [25:25<06:48, 
3.46s/it, loss=0.82, lr=0.000136]\nSteps: 76%|███████▋ | 382/500 [25:25<06:48, 3.46s/it, loss=1.02, lr=0.000134]\nSteps: 77%|███████▋ | 383/500 [25:27<05:49, 2.98s/it, loss=1.02, lr=0.000134]\nSteps: 77%|███████▋ | 383/500 [25:27<05:49, 2.98s/it, loss=0.785, lr=0.000132]\nSteps: 77%|███████▋ | 384/500 [25:29<05:07, 2.65s/it, loss=0.785, lr=0.000132]\nSteps: 77%|███████▋ | 384/500 [25:29<05:07, 2.65s/it, loss=0.898, lr=0.00013] \nSteps: 77%|███████▋ | 385/500 [25:37<07:56, 4.14s/it, loss=0.898, lr=0.00013]\nSteps: 77%|███████▋ | 385/500 [25:37<07:56, 4.14s/it, loss=0.836, lr=0.000128]\nSteps: 77%|███████▋ | 386/500 [25:39<06:34, 3.46s/it, loss=0.836, lr=0.000128]\nSteps: 77%|███████▋ | 386/500 [25:39<06:34, 3.46s/it, loss=0.894, lr=0.000126]\nSteps: 77%|███████▋ | 387/500 [25:41<05:37, 2.98s/it, loss=0.894, lr=0.000126]\nSteps: 77%|███████▋ | 387/500 [25:41<05:37, 2.98s/it, loss=0.776, lr=0.000124]\nSteps: 78%|███████▊ | 388/500 [25:42<04:56, 2.65s/it, loss=0.776, lr=0.000124]\nSteps: 78%|███████▊ | 388/500 [25:42<04:56, 2.65s/it, loss=1.06, lr=0.000122] \nSteps: 78%|███████▊ | 389/500 [25:50<07:38, 4.14s/it, loss=1.06, lr=0.000122]\nSteps: 78%|███████▊ | 389/500 [25:50<07:38, 4.14s/it, loss=0.79, lr=0.000121]\nSteps: 78%|███████▊ | 390/500 [25:52<06:20, 3.46s/it, loss=0.79, lr=0.000121]\nSteps: 78%|███████▊ | 390/500 [25:52<06:20, 3.46s/it, loss=0.867, lr=0.000119]\nSteps: 78%|███████▊ | 391/500 [25:54<05:25, 2.98s/it, loss=0.867, lr=0.000119]\nSteps: 78%|███████▊ | 391/500 [25:54<05:25, 2.98s/it, loss=0.79, lr=0.000117] \nSteps: 78%|███████▊ | 392/500 [25:56<04:46, 2.65s/it, loss=0.79, lr=0.000117]\nSteps: 78%|███████▊ | 392/500 [25:56<04:46, 2.65s/it, loss=0.867, lr=0.000115]\nSteps: 79%|███████▊ | 393/500 [26:03<07:21, 4.12s/it, loss=0.867, lr=0.000115]\nSteps: 79%|███████▊ | 393/500 [26:03<07:21, 4.12s/it, loss=0.818, lr=0.000113]\nSteps: 79%|███████▉ | 394/500 [26:05<06:05, 3.45s/it, loss=0.818, lr=0.000113]\nSteps: 79%|███████▉ | 394/500 [26:05<06:05, 
3.45s/it, loss=0.931, lr=0.000111]\nSteps: 79%|███████▉ | 395/500 [26:07<05:12, 2.97s/it, loss=0.931, lr=0.000111]\nSteps: 79%|███████▉ | 395/500 [26:07<05:12, 2.97s/it, loss=0.821, lr=0.000109]\nSteps: 79%|███████▉ | 396/500 [26:09<04:35, 2.65s/it, loss=0.821, lr=0.000109]\nSteps: 79%|███████▉ | 396/500 [26:09<04:35, 2.65s/it, loss=0.916, lr=0.000107]\nSteps: 79%|███████▉ | 397/500 [26:17<07:10, 4.18s/it, loss=0.916, lr=0.000107]\nSteps: 79%|███████▉ | 397/500 [26:17<07:10, 4.18s/it, loss=0.805, lr=0.000105]\nSteps: 80%|███████▉ | 398/500 [26:18<05:55, 3.49s/it, loss=0.805, lr=0.000105]\nSteps: 80%|███████▉ | 398/500 [26:18<05:55, 3.49s/it, loss=1.06, lr=0.000104] \nSteps: 80%|███████▉ | 399/500 [26:20<05:03, 3.00s/it, loss=1.06, lr=0.000104]\nSteps: 80%|███████▉ | 399/500 [26:20<05:03, 3.00s/it, loss=0.812, lr=0.000102]\nSteps: 80%|████████ | 400/500 [26:22<04:26, 2.66s/it, loss=0.812, lr=0.000102]\nSteps: 80%|████████ | 400/500 [26:22<04:26, 2.66s/it, loss=0.863, lr=0.0001] \nSteps: 80%|████████ | 401/500 [26:30<06:54, 4.19s/it, loss=0.863, lr=0.0001]\nSteps: 80%|████████ | 401/500 [26:30<06:54, 4.19s/it, loss=0.843, lr=9.82e-5]\nSteps: 80%|████████ | 402/500 [26:32<05:42, 3.49s/it, loss=0.843, lr=9.82e-5]\nSteps: 80%|████████ | 402/500 [26:32<05:42, 3.49s/it, loss=0.926, lr=9.64e-5]\nSteps: 81%|████████ | 403/500 [26:34<04:51, 3.01s/it, loss=0.926, lr=9.64e-5]\nSteps: 81%|████████ | 403/500 [26:34<04:51, 3.01s/it, loss=0.953, lr=9.46e-5]\nSteps: 81%|████████ | 404/500 [26:36<04:15, 2.67s/it, loss=0.953, lr=9.46e-5]\nSteps: 81%|████████ | 404/500 [26:36<04:15, 2.67s/it, loss=1.01, lr=9.28e-5] \nSteps: 81%|████████ | 405/500 [26:43<06:40, 4.22s/it, loss=1.01, lr=9.28e-5]\nSteps: 81%|████████ | 405/500 [26:43<06:40, 4.22s/it, loss=0.825, lr=9.11e-5]\nSteps: 81%|████████ | 406/500 [26:45<05:30, 3.52s/it, loss=0.825, lr=9.11e-5]\nSteps: 81%|████████ | 406/500 [26:45<05:30, 3.52s/it, loss=0.909, lr=8.93e-5]\nSteps: 81%|████████▏ | 407/500 [26:47<04:41, 3.02s/it, 
loss=0.909, lr=8.93e-5]\nSteps: 81%|████████▏ | 407/500 [26:47<04:41, 3.02s/it, loss=0.781, lr=8.76e-5]\nSteps: 82%|████████▏ | 408/500 [26:49<04:06, 2.68s/it, loss=0.781, lr=8.76e-5]\nSteps: 82%|████████▏ | 408/500 [26:49<04:06, 2.68s/it, loss=0.862, lr=8.59e-5]\nSteps: 82%|████████▏ | 409/500 [26:57<06:22, 4.20s/it, loss=0.862, lr=8.59e-5]\nSteps: 82%|████████▏ | 409/500 [26:57<06:22, 4.20s/it, loss=0.822, lr=8.41e-5]\nSteps: 82%|████████▏ | 410/500 [26:59<05:15, 3.50s/it, loss=0.822, lr=8.41e-5]\nSteps: 82%|████████▏ | 410/500 [26:59<05:15, 3.50s/it, loss=1.01, lr=8.24e-5] \nSteps: 82%|████████▏ | 411/500 [27:01<04:28, 3.01s/it, loss=1.01, lr=8.24e-5]\nSteps: 82%|████████▏ | 411/500 [27:01<04:28, 3.01s/it, loss=0.829, lr=8.08e-5]\nSteps: 82%|████████▏ | 412/500 [27:02<03:55, 2.67s/it, loss=0.829, lr=8.08e-5]\nSteps: 82%|████████▏ | 412/500 [27:02<03:55, 2.67s/it, loss=0.92, lr=7.91e-5] \nSteps: 83%|████████▎ | 413/500 [27:10<06:02, 4.16s/it, loss=0.92, lr=7.91e-5]\nSteps: 83%|████████▎ | 413/500 [27:10<06:02, 4.16s/it, loss=0.789, lr=7.74e-5]\nSteps: 83%|████████▎ | 414/500 [27:12<04:59, 3.48s/it, loss=0.789, lr=7.74e-5]\nSteps: 83%|████████▎ | 414/500 [27:12<04:59, 3.48s/it, loss=0.979, lr=7.58e-5]\nSteps: 83%|████████▎ | 415/500 [27:14<04:14, 3.00s/it, loss=0.979, lr=7.58e-5]\nSteps: 83%|████████▎ | 415/500 [27:14<04:14, 3.00s/it, loss=0.963, lr=7.41e-5]\nSteps: 83%|████████▎ | 416/500 [27:16<03:43, 2.66s/it, loss=0.963, lr=7.41e-5]\nSteps: 83%|████████▎ | 416/500 [27:16<03:43, 2.66s/it, loss=0.903, lr=7.25e-5]\nSteps: 83%|████████▎ | 417/500 [27:23<05:42, 4.13s/it, loss=0.903, lr=7.25e-5]\nSteps: 83%|████████▎ | 417/500 [27:23<05:42, 4.13s/it, loss=0.803, lr=7.09e-5]\nSteps: 84%|████████▎ | 418/500 [27:25<04:42, 3.45s/it, loss=0.803, lr=7.09e-5]\nSteps: 84%|████████▎ | 418/500 [27:25<04:42, 3.45s/it, loss=0.924, lr=6.93e-5]\nSteps: 84%|████████▍ | 419/500 [27:27<04:01, 2.98s/it, loss=0.924, lr=6.93e-5]\nSteps: 84%|████████▍ | 419/500 [27:27<04:01, 2.98s/it, 
loss=0.769, lr=6.77e-5]\nSteps: 84%|████████▍ | 420/500 [27:29<03:31, 2.64s/it, loss=0.769, lr=6.77e-5]\nSteps: 84%|████████▍ | 420/500 [27:29<03:31, 2.64s/it, loss=0.99, lr=6.62e-5] \nSteps: 84%|████████▍ | 421/500 [27:37<05:28, 4.16s/it, loss=0.99, lr=6.62e-5]\nSteps: 84%|████████▍ | 421/500 [27:37<05:28, 4.16s/it, loss=0.778, lr=6.46e-5]\nSteps: 84%|████████▍ | 422/500 [27:38<04:31, 3.48s/it, loss=0.778, lr=6.46e-5]\nSteps: 84%|████████▍ | 422/500 [27:38<04:31, 3.48s/it, loss=0.994, lr=6.31e-5]\nSteps: 85%|████████▍ | 423/500 [27:40<03:50, 3.00s/it, loss=0.994, lr=6.31e-5]\nSteps: 85%|████████▍ | 423/500 [27:40<03:50, 3.00s/it, loss=0.845, lr=6.16e-5]\nSteps: 85%|████████▍ | 424/500 [27:42<03:22, 2.66s/it, loss=0.845, lr=6.16e-5]\nSteps: 85%|████████▍ | 424/500 [27:42<03:22, 2.66s/it, loss=0.944, lr=6.01e-5]\nSteps: 85%|████████▌ | 425/500 [27:50<05:13, 4.18s/it, loss=0.944, lr=6.01e-5]\nSteps: 85%|████████▌ | 425/500 [27:50<05:13, 4.18s/it, loss=0.771, lr=5.86e-5]\nSteps: 85%|████████▌ | 426/500 [27:52<04:17, 3.49s/it, loss=0.771, lr=5.86e-5]\nSteps: 85%|████████▌ | 426/500 [27:52<04:17, 3.49s/it, loss=0.932, lr=5.71e-5]\nSteps: 85%|████████▌ | 427/500 [27:54<03:39, 3.00s/it, loss=0.932, lr=5.71e-5]\nSteps: 85%|████████▌ | 427/500 [27:54<03:39, 3.00s/it, loss=0.771, lr=5.56e-5]\nSteps: 86%|████████▌ | 428/500 [27:55<03:11, 2.66s/it, loss=0.771, lr=5.56e-5]\nSteps: 86%|████████▌ | 428/500 [27:55<03:11, 2.66s/it, loss=0.861, lr=5.42e-5]\nSteps: 86%|████████▌ | 429/500 [28:03<04:57, 4.19s/it, loss=0.861, lr=5.42e-5]\nSteps: 86%|████████▌ | 429/500 [28:03<04:57, 4.19s/it, loss=0.836, lr=5.28e-5]\nSteps: 86%|████████▌ | 430/500 [28:05<04:04, 3.50s/it, loss=0.836, lr=5.28e-5]\nSteps: 86%|████████▌ | 430/500 [28:05<04:04, 3.50s/it, loss=0.99, lr=5.14e-5] \nSteps: 86%|████████▌ | 431/500 [28:07<03:27, 3.01s/it, loss=0.99, lr=5.14e-5]\nSteps: 86%|████████▌ | 431/500 [28:07<03:27, 3.01s/it, loss=0.804, lr=5e-5] \nSteps: 86%|████████▋ | 432/500 [28:09<03:01, 2.67s/it, 
loss=0.804, lr=5e-5]\nSteps: 86%|████████▋ | 432/500 [28:09<03:01, 2.67s/it, loss=0.885, lr=4.86e-5]\nSteps: 87%|████████▋ | 433/500 [28:17<04:42, 4.22s/it, loss=0.885, lr=4.86e-5]\nSteps: 87%|████████▋ | 433/500 [28:17<04:42, 4.22s/it, loss=1, lr=4.72e-5] \nSteps: 87%|████████▋ | 434/500 [28:19<03:51, 3.51s/it, loss=1, lr=4.72e-5]\nSteps: 87%|████████▋ | 434/500 [28:19<03:51, 3.51s/it, loss=0.999, lr=4.59e-5]\nSteps: 87%|████████▋ | 435/500 [28:20<03:16, 3.02s/it, loss=0.999, lr=4.59e-5]\nSteps: 87%|████████▋ | 435/500 [28:20<03:16, 3.02s/it, loss=0.774, lr=4.46e-5]\nSteps: 87%|████████▋ | 436/500 [28:22<02:51, 2.68s/it, loss=0.774, lr=4.46e-5]\nSteps: 87%|████████▋ | 436/500 [28:22<02:51, 2.68s/it, loss=0.946, lr=4.33e-5]\nSteps: 87%|████████▋ | 437/500 [28:30<04:21, 4.15s/it, loss=0.946, lr=4.33e-5]\nSteps: 87%|████████▋ | 437/500 [28:30<04:21, 4.15s/it, loss=0.841, lr=4.2e-5] \nSteps: 88%|████████▊ | 438/500 [28:32<03:34, 3.46s/it, loss=0.841, lr=4.2e-5]\nSteps: 88%|████████▊ | 438/500 [28:32<03:34, 3.46s/it, loss=1.01, lr=4.07e-5]\nSteps: 88%|████████▊ | 439/500 [28:34<03:02, 2.99s/it, loss=1.01, lr=4.07e-5]\nSteps: 88%|████████▊ | 439/500 [28:34<03:02, 2.99s/it, loss=0.882, lr=3.94e-5]\nSteps: 88%|████████▊ | 440/500 [28:36<02:39, 2.65s/it, loss=0.882, lr=3.94e-5]\nSteps: 88%|████████▊ | 440/500 [28:36<02:39, 2.65s/it, loss=0.937, lr=3.82e-5]\nSteps: 88%|████████▊ | 441/500 [28:44<04:11, 4.26s/it, loss=0.937, lr=3.82e-5]\nSteps: 88%|████████▊ | 441/500 [28:44<04:11, 4.26s/it, loss=0.851, lr=3.7e-5] \nSteps: 88%|████████▊ | 442/500 [28:45<03:25, 3.54s/it, loss=0.851, lr=3.7e-5]\nSteps: 88%|████████▊ | 442/500 [28:45<03:25, 3.54s/it, loss=0.866, lr=3.58e-5]\nSteps: 89%|████████▊ | 443/500 [28:47<02:53, 3.04s/it, loss=0.866, lr=3.58e-5]\nSteps: 89%|████████▊ | 443/500 [28:47<02:53, 3.04s/it, loss=0.918, lr=3.46e-5]\nSteps: 89%|████████▉ | 444/500 [28:49<02:30, 2.69s/it, loss=0.918, lr=3.46e-5]\nSteps: 89%|████████▉ | 444/500 [28:49<02:30, 2.69s/it, loss=0.892, 
lr=3.34e-5]\nSteps: 89%|████████▉ | 445/500 [28:57<03:49, 4.18s/it, loss=0.892, lr=3.34e-5]\nSteps: 89%|████████▉ | 445/500 [28:57<03:49, 4.18s/it, loss=0.787, lr=3.23e-5]\nSteps: 89%|████████▉ | 446/500 [28:59<03:08, 3.49s/it, loss=0.787, lr=3.23e-5]\nSteps: 89%|████████▉ | 446/500 [28:59<03:08, 3.49s/it, loss=0.89, lr=3.11e-5] \nSteps: 89%|████████▉ | 447/500 [29:01<02:39, 3.00s/it, loss=0.89, lr=3.11e-5]\nSteps: 89%|████████▉ | 447/500 [29:01<02:39, 3.00s/it, loss=1.05, lr=3e-5] \nSteps: 90%|████████▉ | 448/500 [29:02<02:18, 2.66s/it, loss=1.05, lr=3e-5]\nSteps: 90%|████████▉ | 448/500 [29:02<02:18, 2.66s/it, loss=1.01, lr=2.89e-5]\nSteps: 90%|████████▉ | 449/500 [29:10<03:31, 4.15s/it, loss=1.01, lr=2.89e-5]\nSteps: 90%|████████▉ | 449/500 [29:10<03:31, 4.15s/it, loss=0.801, lr=2.79e-5]\nSteps: 90%|█████████ | 450/500 [29:12<02:53, 3.47s/it, loss=0.801, lr=2.79e-5]\nSteps: 90%|█████████ | 450/500 [29:12<02:53, 3.47s/it, loss=0.908, lr=2.68e-5]\nSteps: 90%|█████████ | 451/500 [29:14<02:26, 2.99s/it, loss=0.908, lr=2.68e-5]\nSteps: 90%|█████████ | 451/500 [29:14<02:26, 2.99s/it, loss=0.758, lr=2.58e-5]\nSteps: 90%|█████████ | 452/500 [29:16<02:07, 2.65s/it, loss=0.758, lr=2.58e-5]\nSteps: 90%|█████████ | 452/500 [29:16<02:07, 2.65s/it, loss=0.872, lr=2.47e-5]\nSteps: 91%|█████████ | 453/500 [29:23<03:15, 4.15s/it, loss=0.872, lr=2.47e-5]\nSteps: 91%|█████████ | 453/500 [29:23<03:15, 4.15s/it, loss=0.784, lr=2.37e-5]\nSteps: 91%|█████████ | 454/500 [29:25<02:39, 3.47s/it, loss=0.784, lr=2.37e-5]\nSteps: 91%|█████████ | 454/500 [29:25<02:39, 3.47s/it, loss=0.86, lr=2.28e-5] \nSteps: 91%|█████████ | 455/500 [29:27<02:14, 2.99s/it, loss=0.86, lr=2.28e-5]\nSteps: 91%|█████████ | 455/500 [29:27<02:14, 2.99s/it, loss=0.839, lr=2.18e-5]\nSteps: 91%|█████████ | 456/500 [29:29<01:56, 2.66s/it, loss=0.839, lr=2.18e-5]\nSteps: 91%|█████████ | 456/500 [29:29<01:56, 2.66s/it, loss=1, lr=2.09e-5] \nSteps: 91%|█████████▏| 457/500 [29:37<02:59, 4.18s/it, loss=1, 
lr=2.09e-5]\nSteps: 91%|█████████▏| 457/500 [29:37<02:59, 4.18s/it, loss=1.05, lr=1.99e-5]\nSteps: 92%|█████████▏| 458/500 [29:39<02:26, 3.49s/it, loss=1.05, lr=1.99e-5]\nSteps: 92%|█████████▏| 458/500 [29:39<02:26, 3.49s/it, loss=0.987, lr=1.9e-5]\nSteps: 92%|█████████▏| 459/500 [29:40<02:03, 3.00s/it, loss=0.987, lr=1.9e-5]\nSteps: 92%|█████████▏| 459/500 [29:40<02:03, 3.00s/it, loss=0.831, lr=1.82e-5]\nSteps: 92%|█████████▏| 460/500 [29:42<01:46, 2.66s/it, loss=0.831, lr=1.82e-5]\nSteps: 92%|█████████▏| 460/500 [29:42<01:46, 2.66s/it, loss=0.996, lr=1.73e-5]\nSteps: 92%|█████████▏| 461/500 [29:50<02:42, 4.17s/it, loss=0.996, lr=1.73e-5]\nSteps: 92%|█████████▏| 461/500 [29:50<02:42, 4.17s/it, loss=0.805, lr=1.64e-5]\nSteps: 92%|█████████▏| 462/500 [29:52<02:12, 3.48s/it, loss=0.805, lr=1.64e-5]\nSteps: 92%|█████████▏| 462/500 [29:52<02:12, 3.48s/it, loss=1.05, lr=1.56e-5] \nSteps: 93%|█████████▎| 463/500 [29:54<01:50, 3.00s/it, loss=1.05, lr=1.56e-5]\nSteps: 93%|█████████▎| 463/500 [29:54<01:50, 3.00s/it, loss=0.837, lr=1.48e-5]\nSteps: 93%|█████████▎| 464/500 [29:56<01:35, 2.66s/it, loss=0.837, lr=1.48e-5]\nSteps: 93%|█████████▎| 464/500 [29:56<01:35, 2.66s/it, loss=0.87, lr=1.4e-5] \nSteps: 93%|█████████▎| 465/500 [30:03<02:24, 4.13s/it, loss=0.87, lr=1.4e-5]\nSteps: 93%|█████████▎| 465/500 [30:03<02:24, 4.13s/it, loss=0.973, lr=1.33e-5]\nSteps: 93%|█████████▎| 466/500 [30:05<01:57, 3.45s/it, loss=0.973, lr=1.33e-5]\nSteps: 93%|█████████▎| 466/500 [30:05<01:57, 3.45s/it, loss=0.984, lr=1.25e-5]\nSteps: 93%|█████████▎| 467/500 [30:07<01:38, 2.98s/it, loss=0.984, lr=1.25e-5]\nSteps: 93%|█████████▎| 467/500 [30:07<01:38, 2.98s/it, loss=0.84, lr=1.18e-5] \nSteps: 94%|█████████▎| 468/500 [30:09<01:24, 2.65s/it, loss=0.84, lr=1.18e-5]\nSteps: 94%|█████████▎| 468/500 [30:09<01:24, 2.65s/it, loss=0.917, lr=1.11e-5]\nSteps: 94%|█████████▍| 469/500 [30:16<02:08, 4.15s/it, loss=0.917, lr=1.11e-5]\nSteps: 94%|█████████▍| 469/500 [30:16<02:08, 4.15s/it, loss=0.838, 
lr=1.04e-5]\nSteps: 94%|█████████▍| 470/500 [30:18<01:43, 3.46s/it, loss=0.838, lr=1.04e-5]\nSteps: 94%|█████████▍| 470/500 [30:18<01:43, 3.46s/it, loss=0.949, lr=9.79e-6]\nSteps: 94%|█████████▍| 471/500 [30:20<01:26, 2.99s/it, loss=0.949, lr=9.79e-6]\nSteps: 94%|█████████▍| 471/500 [30:20<01:26, 2.99s/it, loss=0.809, lr=9.15e-6]\nSteps: 94%|█████████▍| 472/500 [30:22<01:14, 2.65s/it, loss=0.809, lr=9.15e-6]\nSteps: 94%|█████████▍| 472/500 [30:22<01:14, 2.65s/it, loss=1.05, lr=8.54e-6] \nSteps: 95%|█████████▍| 473/500 [30:30<01:52, 4.17s/it, loss=1.05, lr=8.54e-6]\nSteps: 95%|█████████▍| 473/500 [30:30<01:52, 4.17s/it, loss=0.781, lr=7.94e-6]\nSteps: 95%|█████████▍| 474/500 [30:32<01:30, 3.48s/it, loss=0.781, lr=7.94e-6]\nSteps: 95%|█████████▍| 474/500 [30:32<01:30, 3.48s/it, loss=1.02, lr=7.37e-6] \nSteps: 95%|█████████▌| 475/500 [30:33<01:14, 3.00s/it, loss=1.02, lr=7.37e-6]\nSteps: 95%|█████████▌| 475/500 [30:33<01:14, 3.00s/it, loss=0.77, lr=6.81e-6]\nSteps: 95%|█████████▌| 476/500 [30:35<01:03, 2.66s/it, loss=0.77, lr=6.81e-6]\nSteps: 95%|█████████▌| 476/500 [30:35<01:03, 2.66s/it, loss=0.964, lr=6.28e-6]\nSteps: 95%|█████████▌| 477/500 [30:43<01:35, 4.16s/it, loss=0.964, lr=6.28e-6]\nSteps: 95%|█████████▌| 477/500 [30:43<01:35, 4.16s/it, loss=0.813, lr=5.77e-6]\nSteps: 96%|█████████▌| 478/500 [30:45<01:16, 3.47s/it, loss=0.813, lr=5.77e-6]\nSteps: 96%|█████████▌| 478/500 [30:45<01:16, 3.47s/it, loss=1.06, lr=5.28e-6] \nSteps: 96%|█████████▌| 479/500 [30:47<01:02, 3.00s/it, loss=1.06, lr=5.28e-6]\nSteps: 96%|█████████▌| 479/500 [30:47<01:02, 3.00s/it, loss=0.959, lr=4.82e-6]\nSteps: 96%|█████████▌| 480/500 [30:49<00:53, 2.66s/it, loss=0.959, lr=4.82e-6]\nSteps: 96%|█████████▌| 480/500 [30:49<00:53, 2.66s/it, loss=0.859, lr=4.37e-6]\nSteps: 96%|█████████▌| 481/500 [30:56<01:19, 4.17s/it, loss=0.859, lr=4.37e-6]\nSteps: 96%|█████████▌| 481/500 [30:56<01:19, 4.17s/it, loss=1.03, lr=3.95e-6] \nSteps: 96%|█████████▋| 482/500 [30:58<01:02, 3.48s/it, loss=1.03, 
lr=3.95e-6]\nSteps: 96%|█████████▋| 482/500 [30:58<01:02, 3.48s/it, loss=1.07, lr=3.54e-6]\nSteps: 97%|█████████▋| 483/500 [31:00<00:50, 3.00s/it, loss=1.07, lr=3.54e-6]\nSteps: 97%|█████████▋| 483/500 [31:00<00:50, 3.00s/it, loss=0.791, lr=3.16e-6]\nSteps: 97%|█████████▋| 484/500 [31:02<00:42, 2.66s/it, loss=0.791, lr=3.16e-6]\nSteps: 97%|█████████▋| 484/500 [31:02<00:42, 2.66s/it, loss=0.934, lr=2.8e-6] \nSteps: 97%|█████████▋| 485/500 [31:10<01:02, 4.15s/it, loss=0.934, lr=2.8e-6]\nSteps: 97%|█████████▋| 485/500 [31:10<01:02, 4.15s/it, loss=0.777, lr=2.46e-6]\nSteps: 97%|█████████▋| 486/500 [31:11<00:48, 3.47s/it, loss=0.777, lr=2.46e-6]\nSteps: 97%|█████████▋| 486/500 [31:11<00:48, 3.47s/it, loss=0.957, lr=2.15e-6]\nSteps: 97%|█████████▋| 487/500 [31:13<00:38, 2.99s/it, loss=0.957, lr=2.15e-6]\nSteps: 97%|█████████▋| 487/500 [31:13<00:38, 2.99s/it, loss=1.04, lr=1.85e-6] \nSteps: 98%|█████████▊| 488/500 [31:15<00:31, 2.65s/it, loss=1.04, lr=1.85e-6]\nSteps: 98%|█████████▊| 488/500 [31:15<00:31, 2.65s/it, loss=1.01, lr=1.58e-6]\nSteps: 98%|█████████▊| 489/500 [31:23<00:45, 4.12s/it, loss=1.01, lr=1.58e-6]\nSteps: 98%|█████████▊| 489/500 [31:23<00:45, 4.12s/it, loss=0.779, lr=1.33e-6]\nSteps: 98%|█████████▊| 490/500 [31:25<00:34, 3.45s/it, loss=0.779, lr=1.33e-6]\nSteps: 98%|█████████▊| 490/500 [31:25<00:34, 3.45s/it, loss=0.955, lr=1.1e-6] \nSteps: 98%|█████████▊| 491/500 [31:26<00:26, 2.97s/it, loss=0.955, lr=1.1e-6]\nSteps: 98%|█████████▊| 491/500 [31:26<00:26, 2.97s/it, loss=0.839, lr=8.88e-7]\nSteps: 98%|█████████▊| 492/500 [31:28<00:21, 2.64s/it, loss=0.839, lr=8.88e-7]\nSteps: 98%|█████████▊| 492/500 [31:28<00:21, 2.64s/it, loss=1.06, lr=7.01e-7] \nSteps: 99%|█████████▊| 493/500 [31:36<00:29, 4.19s/it, loss=1.06, lr=7.01e-7]\nSteps: 99%|█████████▊| 493/500 [31:36<00:29, 4.19s/it, loss=0.975, lr=5.37e-7]\nSteps: 99%|█████████▉| 494/500 [31:38<00:20, 3.49s/it, loss=0.975, lr=5.37e-7]\nSteps: 99%|█████████▉| 494/500 [31:38<00:20, 3.49s/it, loss=0.987, 
lr=3.95e-7]\nSteps: 99%|█████████▉| 495/500 [31:40<00:15, 3.01s/it, loss=0.987, lr=3.95e-7]\nSteps: 99%|█████████▉| 495/500 [31:40<00:15, 3.01s/it, loss=0.783, lr=2.74e-7]\nSteps: 99%|█████████▉| 496/500 [31:42<00:10, 2.67s/it, loss=0.783, lr=2.74e-7]\nSteps: 99%|█████████▉| 496/500 [31:42<00:10, 2.67s/it, loss=0.941, lr=1.75e-7]\nSteps: 99%|█████████▉| 497/500 [31:49<00:12, 4.18s/it, loss=0.941, lr=1.75e-7]\nSteps: 99%|█████████▉| 497/500 [31:49<00:12, 4.18s/it, loss=0.78, lr=9.87e-8] \nSteps: 100%|█████████▉| 498/500 [31:51<00:06, 3.49s/it, loss=0.78, lr=9.87e-8]\nSteps: 100%|█████████▉| 498/500 [31:51<00:06, 3.49s/it, loss=0.928, lr=4.39e-8]\nSteps: 100%|█████████▉| 499/500 [31:53<00:03, 3.00s/it, loss=0.928, lr=4.39e-8]\nSteps: 100%|█████████▉| 499/500 [31:53<00:03, 3.00s/it, loss=1.03, lr=1.1e-8] \nSteps: 100%|██████████| 500/500 [31:55<00:00, 2.66s/it, loss=1.03, lr=1.1e-8]\nSteps: 100%|██████████| 500/500 [31:55<00:00, 2.66s/it, loss=0.906, lr=0] \nSteps: 100%|██████████| 500/500 [31:59<00:00, 3.84s/it, loss=0.906, lr=0]\n---Tar up output directory---\nmochi-lora/\nmochi-lora/pytorch_lora_weights.safetensors\nUploading to Hugging Face: lucataco/mochi-lora-vhs\nHF Repo URL: https://huggingface.co/lucataco/mochi-lora-vhs\npytorch_lora_weights.safetensors: 0%| | 0.00/76.1M [00:00<?, ?B/s]\npytorch_lora_weights.safetensors: 10%|▉ | 7.34M/76.1M [00:00<00:00, 73.4MB/s]\npytorch_lora_weights.safetensors: 21%|██ | 16.0M/76.1M [00:00<00:01, 42.3MB/s]\npytorch_lora_weights.safetensors: 42%|████▏ | 32.0M/76.1M [00:00<00:00, 46.4MB/s]\npytorch_lora_weights.safetensors: 63%|██████▎ | 48.0M/76.1M [00:00<00:00, 54.6MB/s]\npytorch_lora_weights.safetensors: 84%|████████▍ | 64.0M/76.1M [00:01<00:00, 57.3MB/s]\npytorch_lora_weights.safetensors: 100%|██████████| 76.1M/76.1M [00:01<00:00, 54.6MB/s]\nSuccessfully uploaded model to https://huggingface.co/lucataco/mochi-lora-vhs",
"metrics": {
"predict_time": 1939.37619092,
"total_time": 1939.383131
},
"output": {
"weights": "https://replicate.delivery/xezq/eWTxCE13svWocSUepK0rWZdxJhthKmFtpE2SzDBfrSVdcxznA/trained_model.tar"
},
"started_at": "2024-12-11T17:55:06.819940Z",
"status": "succeeded",
"urls": {
"get": "https://api.replicate.com/v1/predictions/mxey6b9vqnrm80ckpveaqjqb9w",
"cancel": "https://api.replicate.com/v1/predictions/mxey6b9vqnrm80ckpveaqjqb9w/cancel"
},
"version": "170ea99fb48a30fef98cb1c9fb403a2882ab9d60c2ba15ad9383ace33c3fa385"
}
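The `lr` and epoch figures in the training log can be reproduced with a little arithmetic. The sketch below is an inference from the logged numbers, not the trainer's actual code: the log does not name the scheduler, and `WARMUP_STEPS = 200` is an assumption derived from the 2e-6 per-step increments early in the run. Under that assumption, a linear warmup followed by a cosine decay to zero matches the logged values at both ends of the run. The function name `lr_at` is hypothetical.

```python
import math

PEAK_LR = 4e-4        # learning_rate from the prediction input
TOTAL_STEPS = 500     # steps from the prediction input
WARMUP_STEPS = 200    # assumption: inferred from lr rising by 2e-6 per step
NUM_EXAMPLES = 4      # "Num examples" in the log
BATCH_SIZE = 1        # batch_size from the prediction input

# 4 videos at batch size 1 give 4 batches per epoch, so 500 steps span 125 epochs.
batches_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
num_epochs = math.ceil(TOTAL_STEPS / batches_per_epoch)


def lr_at(step: int) -> float:
    """Learning rate shown in the progress bar once its counter reads `step`."""
    if step <= WARMUP_STEPS:
        # Linear warmup from 0 to PEAK_LR.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from PEAK_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"batches per epoch: {batches_per_epoch}, epochs: {num_epochs}")
for step in (1, 50, 85, 400, 432, 499, 500):
    print(f"step {step:3d}: lr={lr_at(step):.3g}")
```

The sampled steps line up with the log entries for the same counter values, for example `lr=2e-6` at 1/500, `lr=0.0001` at 400/500, `lr=1.1e-8` at 499/500 and `lr=0` at 500/500. If the trainer uses a different warmup length or schedule, the middle of the curve would differ even though these points agree.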
Cleaning up previous runs
Extracted 8 files from zip to videos_input
---Starting to Trim input videos---
Processing: videos_input/vhs1.mp4
Copied videos_input/vhs1.txt to videos_prepared/vhs1.txt
Moviepy - Building video videos_prepared/vhs1.mp4.
Moviepy - Writing video videos_prepared/vhs1.mp4
0%| | 0/4 [00:00<?, ?it/s]
0%| | 0/4 [00:00<?, ?it/s]
0%| | 0/4 [00:00<?, ?it/s]
t: 0%| | 0/40 [00:00<?, ?it/s, now=None]
0%| | 0/4 [00:00<?, ?it/s]
Moviepy - Done !
Moviepy - video ready videos_prepared/vhs1.mp4
0%| | 0/4 [00:00<?, ?it/s]
Processing: videos_input/vhs2.mp4
Copied videos_input/vhs2.txt to videos_prepared/vhs2.txt
Moviepy - Building video videos_prepared/vhs2.mp4.
Moviepy - Writing video videos_prepared/vhs2.mp4
25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]
25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]
25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]
t: 0%| | 0/40 [00:00<?, ?it/s, now=None]
25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]
Moviepy - Done !
Moviepy - video ready videos_prepared/vhs2.mp4
25%|██▌ | 1/4 [00:00<00:00, 3.16it/s]
Processing: videos_input/vhs3.mp4
Copied videos_input/vhs3.txt to videos_prepared/vhs3.txt
Moviepy - Building video videos_prepared/vhs3.mp4.
50%|█████ | 2/4 [00:00<00:00, 3.05it/s]
50%|█████ | 2/4 [00:00<00:00, 3.05it/s]
Moviepy - Writing video videos_prepared/vhs3.mp4
50%|█████ | 2/4 [00:00<00:00, 3.05it/s]
t: 0%| | 0/40 [00:00<?, ?it/s, now=None]
50%|█████ | 2/4 [00:00<00:00, 3.05it/s]
Moviepy - Done !
Moviepy - video ready videos_prepared/vhs3.mp4
50%|█████ | 2/4 [00:00<00:00, 3.05it/s]
Processing: videos_input/vhs4.mp4
Copied videos_input/vhs4.txt to videos_prepared/vhs4.txt
Moviepy - Building video videos_prepared/vhs4.mp4.
Moviepy - Writing video videos_prepared/vhs4.mp4
75%|███████▌ | 3/4 [00:00<00:00, 3.05it/s]
75%|███████▌ | 3/4 [00:01<00:00, 3.05it/s]
75%|███████▌ | 3/4 [00:01<00:00, 3.05it/s]
t: 0%| | 0/40 [00:00<?, ?it/s, now=None]
75%|███████▌ | 3/4 [00:01<00:00, 3.05it/s]
Moviepy - Done !
Moviepy - video ready videos_prepared/vhs4.mp4
75%|███████▌ | 3/4 [00:01<00:00, 3.05it/s]
100%|██████████| 4/4 [00:01<00:00, 3.07it/s]
100%|██████████| 4/4 [00:01<00:00, 3.07it/s]
---Starting to Embed videos---
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 50%|█████ | 1/2 [00:00<00:00, 1.67it/s]
Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00, 1.78it/s]
Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00, 1.76it/s]
Loading pipeline components...: 0%| | 0/3 [00:00<?, ?it/s]
Loading pipeline components...: 100%|██████████| 3/3 [00:00<00:00, 681.59it/s]
Processing videos_prepared/vhs1.mp4
Trimmed video from 40 to first 37 frames
0it [00:00, ?it/s]
Processing videos_prepared/vhs2.mp4
Trimmed video from 40 to first 37 frames
1it [00:01, 1.38s/it]
Processing videos_prepared/vhs3.mp4
Trimmed video from 40 to first 37 frames
2it [00:02, 1.15s/it]
Processing videos_prepared/vhs4.mp4
Trimmed video from 40 to first 37 frames
3it [00:03, 1.07s/it]
4it [00:04, 1.03s/it]
4it [00:04, 1.08s/it]
---Starting training---
Found 4 training videos in videos_prepared
Loaded 4/4 valid file pairs.
===== Memory before training =====
memory_allocated=18.780 GB
max_memory_allocated=18.780 GB
max_memory_reserved=19.250 GB
***** Running training *****
Num trainable parameters = 19005440
Num examples = 4
Num batches each epoch = 4
Num epochs = 125
Instantaneous batch size per device = 1
Total train batch size (w. parallel, distributed & accumulation) = 1
Total optimization steps = 500
Steps: 0%| | 0/500 [00:00<?, ?it/s]
W1211 17:57:31.075000 135012609543680 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
W1211 17:57:31.089000 135012609543680 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
W1211 17:57:31.224000 135012609543680 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
Steps: 0%| | 1/500 [04:16<35:33:30, 256.53s/it]
Steps: 0%| | 1/500 [04:16<35:33:30, 256.53s/it, loss=0.933, lr=2e-6]
Steps: 0%| | 2/500 [04:18<14:45:44, 106.72s/it, loss=0.933, lr=2e-6]
Steps: 0%| | 2/500 [04:18<14:45:44, 106.72s/it, loss=1.05, lr=4e-6]
Steps: 1%| | 3/500 [04:20<8:07:22, 58.84s/it, loss=1.05, lr=4e-6]
Steps: 1%| | 3/500 [04:20<8:07:22, 58.84s/it, loss=0.864, lr=6e-6]
Steps: 1%| | 4/500 [04:22<5:00:27, 36.34s/it, loss=0.864, lr=6e-6]
Steps: 1%| | 4/500 [04:22<5:00:27, 36.34s/it, loss=1.06, lr=8e-6]
Steps: 1%| | 5/500 [04:29<3:34:15, 25.97s/it, loss=1.06, lr=8e-6]
Steps: 1%| | 5/500 [04:29<3:34:15, 25.97s/it, loss=0.874, lr=1e-5]
Steps: 1%| | 6/500 [04:31<2:26:21, 17.78s/it, loss=0.874, lr=1e-5]
Steps: 1%| | 6/500 [04:31<2:26:21, 17.78s/it, loss=1.02, lr=1.2e-5]
Steps: 1%|▏ | 7/500 [04:33<1:43:19, 12.58s/it, loss=1.02, lr=1.2e-5]
Steps: 1%|▏ | 7/500 [04:33<1:43:19, 12.58s/it, loss=0.902, lr=1.4e-5]
Steps: 2%|▏ | 8/500 [04:35<1:15:09, 9.17s/it, loss=0.902, lr=1.4e-5]
Steps: 2%|▏ | 8/500 [04:35<1:15:09, 9.17s/it, loss=1.08, lr=1.6e-5]
Steps: 2%|▏ | 9/500 [04:42<1:11:01, 8.68s/it, loss=1.08, lr=1.6e-5]
Steps: 2%|▏ | 9/500 [04:42<1:11:01, 8.68s/it, loss=1.04, lr=1.8e-5]
Steps: 2%|▏ | 10/500 [04:44<53:42, 6.58s/it, loss=1.04, lr=1.8e-5]
Steps: 2%|▏ | 10/500 [04:44<53:42, 6.58s/it, loss=1.09, lr=2e-5]
Steps: 2%|▏ | 11/500 [04:46<41:50, 5.13s/it, loss=1.09, lr=2e-5]
Steps: 2%|▏ | 11/500 [04:46<41:50, 5.13s/it, loss=0.886, lr=2.2e-5]
Steps: 2%|▏ | 12/500 [04:48<33:40, 4.14s/it, loss=0.886, lr=2.2e-5]
Steps: 2%|▏ | 12/500 [04:48<33:40, 4.14s/it, loss=1.1, lr=2.4e-5]
Steps: 3%|▎ | 13/500 [04:56<42:08, 5.19s/it, loss=1.1, lr=2.4e-5]
Steps: 3%|▎ | 13/500 [04:56<42:08, 5.19s/it, loss=0.881, lr=2.6e-5]
Steps: 3%|▎ | 14/500 [04:57<33:55, 4.19s/it, loss=0.881, lr=2.6e-5]
Steps: 3%|▎ | 14/500 [04:57<33:55, 4.19s/it, loss=1.07, lr=2.8e-5]
Steps: 3%|▎ | 15/500 [04:59<28:12, 3.49s/it, loss=1.07, lr=2.8e-5]
Steps: 3%|▎ | 15/500 [04:59<28:12, 3.49s/it, loss=0.79, lr=3e-5]
Steps: 3%|▎ | 16/500 [05:01<24:13, 3.00s/it, loss=0.79, lr=3e-5]
Steps: 3%|▎ | 16/500 [05:01<24:13, 3.00s/it, loss=1.07, lr=3.2e-5]
Steps: 3%|▎ | 17/500 [05:09<35:16, 4.38s/it, loss=1.07, lr=3.2e-5]
Steps: 3%|▎ | 17/500 [05:09<35:16, 4.38s/it, loss=0.873, lr=3.4e-5]
Steps: 4%|▎ | 18/500 [05:11<29:08, 3.63s/it, loss=0.873, lr=3.4e-5]
Steps: 4%|▎ | 18/500 [05:11<29:08, 3.63s/it, loss=0.968, lr=3.6e-5]
Steps: 4%|▍ | 19/500 [05:13<24:50, 3.10s/it, loss=0.968, lr=3.6e-5]
Steps: 4%|▍ | 19/500 [05:13<24:50, 3.10s/it, loss=0.979, lr=3.8e-5]
Steps: 4%|▍ | 20/500 [05:14<21:50, 2.73s/it, loss=0.979, lr=3.8e-5]
Steps: 4%|▍ | 20/500 [05:14<21:50, 2.73s/it, loss=1.08, lr=4e-5]
Steps: 4%|▍ | 21/500 [05:22<33:40, 4.22s/it, loss=1.08, lr=4e-5]
Steps: 4%|▍ | 21/500 [05:22<33:40, 4.22s/it, loss=0.866, lr=4.2e-5]
Steps: 4%|▍ | 22/500 [05:24<27:59, 3.51s/it, loss=0.866, lr=4.2e-5]
Steps: 4%|▍ | 22/500 [05:24<27:59, 3.51s/it, loss=0.966, lr=4.4e-5]
Steps: 5%|▍ | 23/500 [05:26<24:01, 3.02s/it, loss=0.966, lr=4.4e-5]
Steps: 5%|▍ | 23/500 [05:26<24:01, 3.02s/it, loss=0.849, lr=4.6e-5]
Steps: 5%|▍ | 24/500 [05:28<21:13, 2.68s/it, loss=0.849, lr=4.6e-5]
Steps: 5%|▍ | 24/500 [05:28<21:13, 2.68s/it, loss=1.07, lr=4.8e-5]
Steps: 5%|▌ | 25/500 [05:35<33:08, 4.19s/it, loss=1.07, lr=4.8e-5]
Steps: 5%|▌ | 25/500 [05:35<33:08, 4.19s/it, loss=0.853, lr=5e-5]
Steps: 5%|▌ | 26/500 [05:37<27:34, 3.49s/it, loss=0.853, lr=5e-5]
Steps: 5%|▌ | 26/500 [05:37<27:34, 3.49s/it, loss=0.996, lr=5.2e-5]
Steps: 5%|▌ | 27/500 [05:39<23:40, 3.00s/it, loss=0.996, lr=5.2e-5]
Steps: 5%|▌ | 27/500 [05:39<23:40, 3.00s/it, loss=0.879, lr=5.4e-5]
Steps: 6%|▌ | 28/500 [05:41<20:56, 2.66s/it, loss=0.879, lr=5.4e-5]
Steps: 6%|▌ | 28/500 [05:41<20:56, 2.66s/it, loss=0.977, lr=5.6e-5]
Steps: 6%|▌ | 29/500 [05:49<32:28, 4.14s/it, loss=0.977, lr=5.6e-5]
Steps: 6%|▌ | 29/500 [05:49<32:28, 4.14s/it, loss=0.881, lr=5.8e-5]
Steps: 6%|▌ | 30/500 [05:50<27:04, 3.46s/it, loss=0.881, lr=5.8e-5]
Steps: 6%|▌ | 30/500 [05:50<27:04, 3.46s/it, loss=1.06, lr=6e-5]
Steps: 6%|▌ | 31/500 [05:52<23:18, 2.98s/it, loss=1.06, lr=6e-5]
Steps: 6%|▌ | 31/500 [05:52<23:18, 2.98s/it, loss=1.05, lr=6.2e-5]
Steps: 6%|▋ | 32/500 [05:54<20:39, 2.65s/it, loss=1.05, lr=6.2e-5]
Steps: 6%|▋ | 32/500 [05:54<20:39, 2.65s/it, loss=0.985, lr=6.4e-5]
Steps: 7%|▋ | 33/500 [06:02<32:21, 4.16s/it, loss=0.985, lr=6.4e-5]
Steps: 7%|▋ | 33/500 [06:02<32:21, 4.16s/it, loss=0.871, lr=6.6e-5]
Steps: 7%|▋ | 34/500 [06:04<26:57, 3.47s/it, loss=0.871, lr=6.6e-5]
Steps: 7%|▋ | 34/500 [06:04<26:57, 3.47s/it, loss=1.04, lr=6.8e-5]
Steps: 7%|▋ | 35/500 [06:06<23:10, 2.99s/it, loss=1.04, lr=6.8e-5]
Steps: 7%|▋ | 35/500 [06:06<23:10, 2.99s/it, loss=0.829, lr=7e-5]
Steps: 7%|▋ | 36/500 [06:08<20:32, 2.66s/it, loss=0.829, lr=7e-5]
Steps: 7%|▋ | 36/500 [06:08<20:32, 2.66s/it, loss=0.963, lr=7.2e-5]
Steps: 7%|▋ | 37/500 [06:15<32:12, 4.17s/it, loss=0.963, lr=7.2e-5]
Steps: 7%|▋ | 37/500 [06:15<32:12, 4.17s/it, loss=0.878, lr=7.4e-5]
Steps: 8%|▊ | 38/500 [06:17<26:49, 3.48s/it, loss=0.878, lr=7.4e-5]
Steps: 8%|▊ | 38/500 [06:17<26:49, 3.48s/it, loss=1.03, lr=7.6e-5]
Steps: 8%|▊ | 39/500 [06:19<23:02, 3.00s/it, loss=1.03, lr=7.6e-5]
Steps: 8%|▊ | 39/500 [06:19<23:02, 3.00s/it, loss=0.886, lr=7.8e-5]
Steps: 8%|▊ | 40/500 [06:21<20:24, 2.66s/it, loss=0.886, lr=7.8e-5]
Steps: 8%|▊ | 40/500 [06:21<20:24, 2.66s/it, loss=1.06, lr=8e-5]
Steps: 8%|▊ | 41/500 [06:28<31:38, 4.14s/it, loss=1.06, lr=8e-5]
Steps: 8%|▊ | 41/500 [06:28<31:38, 4.14s/it, loss=0.874, lr=8.2e-5]
Steps: 8%|▊ | 42/500 [06:30<26:23, 3.46s/it, loss=0.874, lr=8.2e-5]
Steps: 8%|▊ | 42/500 [06:30<26:23, 3.46s/it, loss=1.07, lr=8.4e-5]
Steps: 9%|▊ | 43/500 [06:32<22:43, 2.98s/it, loss=1.07, lr=8.4e-5]
Steps: 9%|▊ | 43/500 [06:32<22:43, 2.98s/it, loss=0.911, lr=8.6e-5]
Steps: 9%|▉ | 44/500 [06:34<20:07, 2.65s/it, loss=0.911, lr=8.6e-5]
Steps: 9%|▉ | 44/500 [06:34<20:07, 2.65s/it, loss=1.05, lr=8.8e-5]
Steps: 9%|▉ | 45/500 [06:42<31:30, 4.15s/it, loss=1.05, lr=8.8e-5]
Steps: 9%|▉ | 45/500 [06:42<31:30, 4.15s/it, loss=0.874, lr=9e-5]
Steps: 9%|▉ | 46/500 [06:44<26:15, 3.47s/it, loss=0.874, lr=9e-5]
Steps: 9%|▉ | 46/500 [06:44<26:15, 3.47s/it, loss=1.06, lr=9.2e-5]
Steps: 9%|▉ | 47/500 [06:45<22:34, 2.99s/it, loss=1.06, lr=9.2e-5]
Steps: 9%|▉ | 47/500 [06:45<22:34, 2.99s/it, loss=0.833, lr=9.4e-5]
Steps: 10%|▉ | 48/500 [06:47<20:00, 2.66s/it, loss=0.833, lr=9.4e-5]
Steps: 10%|▉ | 48/500 [06:47<20:00, 2.66s/it, loss=0.973, lr=9.6e-5]
Steps: 10%|▉ | 49/500 [06:55<31:31, 4.19s/it, loss=0.973, lr=9.6e-5]
Steps: 10%|▉ | 49/500 [06:55<31:31, 4.19s/it, loss=0.883, lr=9.8e-5]
Steps: 10%|█ | 50/500 [06:57<26:13, 3.50s/it, loss=0.883, lr=9.8e-5]
Steps: 10%|█ | 50/500 [06:57<26:13, 3.50s/it, loss=1.08, lr=0.0001]
Steps: 10%|█ | 51/500 [06:59<22:31, 3.01s/it, loss=1.08, lr=0.0001]
Steps: 10%|█ | 51/500 [06:59<22:31, 3.01s/it, loss=0.826, lr=0.000102]
Steps: 10%|█ | 52/500 [07:01<19:56, 2.67s/it, loss=0.826, lr=0.000102]
Steps: 10%|█ | 52/500 [07:01<19:56, 2.67s/it, loss=0.939, lr=0.000104]
Steps: 11%|█ | 53/500 [07:11<37:37, 5.05s/it, loss=0.939, lr=0.000104]
Steps: 11%|█ | 53/500 [07:11<37:37, 5.05s/it, loss=0.789, lr=0.000106]
Steps: 11%|█ | 54/500 [07:13<30:27, 4.10s/it, loss=0.789, lr=0.000106]
Steps: 11%|█ | 54/500 [07:13<30:27, 4.10s/it, loss=1.05, lr=0.000108]
Steps: 11%|█ | 55/500 [07:15<25:25, 3.43s/it, loss=1.05, lr=0.000108]
Steps: 11%|█ | 55/500 [07:15<25:25, 3.43s/it, loss=1.05, lr=0.00011]
Steps: 11%|█ | 56/500 [07:17<21:55, 2.96s/it, loss=1.05, lr=0.00011]
Steps: 11%|█ | 56/500 [07:17<21:55, 2.96s/it, loss=0.958, lr=0.000112]
Steps: 11%|█▏ | 57/500 [07:24<31:59, 4.33s/it, loss=0.958, lr=0.000112]
Steps: 11%|█▏ | 57/500 [07:24<31:59, 4.33s/it, loss=0.842, lr=0.000114]
Steps: 12%|█▏ | 58/500 [07:26<26:28, 3.59s/it, loss=0.842, lr=0.000114]
Steps: 12%|█▏ | 58/500 [07:26<26:28, 3.59s/it, loss=0.939, lr=0.000116]
Steps: 12%|█▏ | 59/500 [07:28<22:36, 3.08s/it, loss=0.939, lr=0.000116]
Steps: 12%|█▏ | 59/500 [07:28<22:36, 3.08s/it, loss=0.882, lr=0.000118]
Steps: 12%|█▏ | 60/500 [07:30<19:54, 2.71s/it, loss=0.882, lr=0.000118]
Steps: 12%|█▏ | 60/500 [07:30<19:54, 2.71s/it, loss=0.952, lr=0.00012]
Steps: 12%|█▏ | 61/500 [07:38<30:27, 4.16s/it, loss=0.952, lr=0.00012]
Steps: 12%|█▏ | 61/500 [07:38<30:27, 4.16s/it, loss=1.05, lr=0.000122]
Steps: 12%|█▏ | 62/500 [07:40<25:22, 3.48s/it, loss=1.05, lr=0.000122]
Steps: 12%|█▏ | 62/500 [07:40<25:22, 3.48s/it, loss=0.985, lr=0.000124]
Steps: 13%|█▎ | 63/500 [07:41<21:48, 3.00s/it, loss=0.985, lr=0.000124]
Steps: 13%|█▎ | 63/500 [07:41<21:48, 3.00s/it, loss=0.816, lr=0.000126]
Steps: 13%|█▎ | 64/500 [07:43<19:18, 2.66s/it, loss=0.816, lr=0.000126]
Steps: 13%|█▎ | 64/500 [07:43<19:18, 2.66s/it, loss=1.02, lr=0.000128]
Steps: 13%|█▎ | 65/500 [07:51<30:04, 4.15s/it, loss=1.02, lr=0.000128]
Steps: 13%|█▎ | 65/500 [07:51<30:04, 4.15s/it, loss=0.855, lr=0.00013]
Steps: 13%|█▎ | 66/500 [07:53<25:03, 3.47s/it, loss=0.855, lr=0.00013]
Steps: 13%|█▎ | 66/500 [07:53<25:03, 3.47s/it, loss=0.947, lr=0.000132]
Steps: 13%|█▎ | 67/500 [07:55<21:33, 2.99s/it, loss=0.947, lr=0.000132]
Steps: 13%|█▎ | 67/500 [07:55<21:33, 2.99s/it, loss=0.879, lr=0.000134]
Steps: 14%|█▎ | 68/500 [07:56<19:05, 2.65s/it, loss=0.879, lr=0.000134]
Steps: 14%|█▎ | 68/500 [07:57<19:05, 2.65s/it, loss=1.06, lr=0.000136]
Steps: 14%|█▍ | 69/500 [08:04<30:06, 4.19s/it, loss=1.06, lr=0.000136]
Steps: 14%|█▍ | 69/500 [08:04<30:06, 4.19s/it, loss=0.825, lr=0.000138]
Steps: 14%|█▍ | 70/500 [08:06<25:02, 3.50s/it, loss=0.825, lr=0.000138]
Steps: 14%|█▍ | 70/500 [08:06<25:02, 3.50s/it, loss=0.924, lr=0.00014]
Steps: 14%|█▍ | 71/500 [08:08<21:30, 3.01s/it, loss=0.924, lr=0.00014]
Steps: 14%|█▍ | 71/500 [08:08<21:30, 3.01s/it, loss=0.794, lr=0.000142]
Steps: 14%|█▍ | 72/500 [08:10<19:01, 2.67s/it, loss=0.794, lr=0.000142]
Steps: 14%|█▍ | 72/500 [08:10<19:01, 2.67s/it, loss=0.978, lr=0.000144]
Steps: 15%|█▍ | 73/500 [08:18<29:45, 4.18s/it, loss=0.978, lr=0.000144]
Steps: 15%|█▍ | 73/500 [08:18<29:45, 4.18s/it, loss=0.996, lr=0.000146]
Steps: 15%|█▍ | 74/500 [08:19<24:46, 3.49s/it, loss=0.996, lr=0.000146]
Steps: 15%|█▍ | 74/500 [08:19<24:46, 3.49s/it, loss=1.07, lr=0.000148]
Steps: 15%|█▌ | 75/500 [08:21<21:16, 3.00s/it, loss=1.07, lr=0.000148]
Steps: 15%|█▌ | 75/500 [08:21<21:16, 3.00s/it, loss=0.842, lr=0.00015]
Steps: 15%|█▌ | 76/500 [08:23<18:50, 2.67s/it, loss=0.842, lr=0.00015]
Steps: 15%|█▌ | 76/500 [08:23<18:50, 2.67s/it, loss=0.946, lr=0.000152]
Steps: 15%|█▌ | 77/500 [08:31<29:24, 4.17s/it, loss=0.946, lr=0.000152]
Steps: 15%|█▌ | 77/500 [08:31<29:24, 4.17s/it, loss=0.838, lr=0.000154]
Steps: 16%|█▌ | 78/500 [08:33<24:29, 3.48s/it, loss=0.838, lr=0.000154]
Steps: 16%|█▌ | 78/500 [08:33<24:29, 3.48s/it, loss=1.06, lr=0.000156]
Steps: 16%|█▌ | 79/500 [08:35<21:02, 3.00s/it, loss=1.06, lr=0.000156]
Steps: 16%|█▌ | 79/500 [08:35<21:02, 3.00s/it, loss=0.85, lr=0.000158]
Steps: 16%|█▌ | 80/500 [08:37<18:37, 2.66s/it, loss=0.85, lr=0.000158]
Steps: 16%|█▌ | 80/500 [08:37<18:37, 2.66s/it, loss=0.923, lr=0.00016]
Steps: 16%|█▌ | 81/500 [08:44<29:10, 4.18s/it, loss=0.923, lr=0.00016]
Steps: 16%|█▌ | 81/500 [08:44<29:10, 4.18s/it, loss=0.764, lr=0.000162]
Steps: 16%|█▋ | 82/500 [08:46<24:17, 3.49s/it, loss=0.764, lr=0.000162]
Steps: 16%|█▋ | 82/500 [08:46<24:17, 3.49s/it, loss=0.94, lr=0.000164]
Steps: 17%|█▋ | 83/500 [08:48<20:51, 3.00s/it, loss=0.94, lr=0.000164]
Steps: 17%|█▋ | 83/500 [08:48<20:51, 3.00s/it, loss=0.828, lr=0.000166]
Steps: 17%|█▋ | 84/500 [08:50<18:27, 2.66s/it, loss=0.828, lr=0.000166]
Steps: 17%|█▋ | 84/500 [08:50<18:27, 2.66s/it, loss=1.02, lr=0.000168]
Steps: 17%|█▋ | 85/500 [08:58<28:59, 4.19s/it, loss=1.02, lr=0.000168]
Steps: 17%|█▋ | 85/500 [08:58<28:59, 4.19s/it, loss=0.991, lr=0.00017]
Steps: 17%|█▋ | 86/500 [08:59<24:07, 3.50s/it, loss=0.991, lr=0.00017]
Steps: 17%|█▋ | 86/500 [09:00<24:07, 3.50s/it, loss=0.975, lr=0.000172]
Steps: 17%|█▋ | 87/500 [09:01<20:42, 3.01s/it, loss=0.975, lr=0.000172]
Steps: 17%|█▋ | 87/500 [09:01<20:42, 3.01s/it, loss=0.814, lr=0.000174]
Steps: 18%|█▊ | 88/500 [09:03<18:19, 2.67s/it, loss=0.814, lr=0.000174]
Steps: 18%|█▊ | 88/500 [09:03<18:19, 2.67s/it, loss=1.07, lr=0.000176]
Steps: 18%|█▊ | 89/500 [09:11<28:25, 4.15s/it, loss=1.07, lr=0.000176]
Steps: 18%|█▊ | 89/500 [09:11<28:25, 4.15s/it, loss=0.859, lr=0.000178]
Steps: 18%|█▊ | 90/500 [09:13<23:41, 3.47s/it, loss=0.859, lr=0.000178]
Steps: 18%|█▊ | 90/500 [09:13<23:41, 3.47s/it, loss=1.06, lr=0.00018]
Steps: 18%|█▊ | 91/500 [09:15<20:21, 2.99s/it, loss=1.06, lr=0.00018]
Steps: 18%|█▊ | 91/500 [09:15<20:21, 2.99s/it, loss=0.825, lr=0.000182]
Steps: 18%|█▊ | 92/500 [09:16<18:02, 2.65s/it, loss=0.825, lr=0.000182]
Steps: 18%|█▊ | 92/500 [09:16<18:02, 2.65s/it, loss=0.954, lr=0.000184]
Steps: 19%|█▊ | 93/500 [09:24<28:00, 4.13s/it, loss=0.954, lr=0.000184]
Steps: 19%|█▊ | 93/500 [09:24<28:00, 4.13s/it, loss=0.852, lr=0.000186]
Steps: 19%|█▉ | 94/500 [09:26<23:21, 3.45s/it, loss=0.852, lr=0.000186]
Steps: 19%|█▉ | 94/500 [09:26<23:21, 3.45s/it, loss=1.04, lr=0.000188]
Steps: 19%|█▉ | 95/500 [09:28<20:06, 2.98s/it, loss=1.04, lr=0.000188]
Steps: 19%|█▉ | 95/500 [09:28<20:06, 2.98s/it, loss=0.847, lr=0.00019]
Steps: 19%|█▉ | 96/500 [09:30<17:49, 2.65s/it, loss=0.847, lr=0.00019]
Steps: 19%|█▉ | 96/500 [09:30<17:49, 2.65s/it, loss=0.921, lr=0.000192]
Steps: 19%|█▉ | 97/500 [09:37<27:56, 4.16s/it, loss=0.921, lr=0.000192]
Steps: 19%|█▉ | 97/500 [09:37<27:56, 4.16s/it, loss=0.873, lr=0.000194]
Steps: 20%|█▉ | 98/500 [09:39<23:16, 3.47s/it, loss=0.873, lr=0.000194]
Steps: 20%|█▉ | 98/500 [09:39<23:16, 3.47s/it, loss=0.977, lr=0.000196]
Steps: 20%|█▉ | 99/500 [09:41<20:00, 2.99s/it, loss=0.977, lr=0.000196]
Steps: 20%|█▉ | 99/500 [09:41<20:00, 2.99s/it, loss=0.851, lr=0.000198]
Steps: 20%|██ | 100/500 [09:43<17:44, 2.66s/it, loss=0.851, lr=0.000198]
Steps: 20%|██ | 100/500 [09:43<17:44, 2.66s/it, loss=0.918, lr=0.0002]
Steps: 20%|██ | 101/500 [09:51<28:24, 4.27s/it, loss=0.918, lr=0.0002]
Steps: 20%|██ | 101/500 [09:51<28:24, 4.27s/it, loss=0.809, lr=0.000202]
Steps: 20%|██ | 102/500 [09:53<23:33, 3.55s/it, loss=0.809, lr=0.000202]
Steps: 20%|██ | 102/500 [09:53<23:33, 3.55s/it, loss=0.916, lr=0.000204]
Steps: 21%|██ | 103/500 [09:55<20:10, 3.05s/it, loss=0.916, lr=0.000204]
Steps: 21%|██ | 103/500 [09:55<20:10, 3.05s/it, loss=1.01, lr=0.000206]
Steps: 21%|██ | 104/500 [09:57<17:48, 2.70s/it, loss=1.01, lr=0.000206]
Steps: 21%|██ | 104/500 [09:57<17:48, 2.70s/it, loss=0.958, lr=0.000208]
Steps: 21%|██ | 105/500 [10:05<28:03, 4.26s/it, loss=0.958, lr=0.000208]
Steps: 21%|██ | 105/500 [10:05<28:03, 4.26s/it, loss=0.807, lr=0.00021]
Steps: 21%|██ | 106/500 [10:06<23:16, 3.55s/it, loss=0.807, lr=0.00021]
Steps: 21%|██ | 106/500 [10:06<23:16, 3.55s/it, loss=0.953, lr=0.000212]
Steps: 21%|██▏ | 107/500 [10:08<19:56, 3.04s/it, loss=0.953, lr=0.000212]
Steps: 21%|██▏ | 107/500 [10:08<19:56, 3.04s/it, loss=0.826, lr=0.000214]
Steps: 22%|██▏ | 108/500 [10:10<17:35, 2.69s/it, loss=0.826, lr=0.000214]
Steps: 22%|██▏ | 108/500 [10:10<17:35, 2.69s/it, loss=1.08, lr=0.000216]
Steps: 22%|██▏ | 109/500 [10:18<27:22, 4.20s/it, loss=1.08, lr=0.000216]
Steps: 22%|██▏ | 109/500 [10:18<27:22, 4.20s/it, loss=0.836, lr=0.000218]
Steps: 22%|██▏ | 110/500 [10:20<22:46, 3.50s/it, loss=0.836, lr=0.000218]
Steps: 22%|██▏ | 110/500 [10:20<22:46, 3.50s/it, loss=1.07, lr=0.00022]
Steps: 22%|██▏ | 111/500 [10:22<19:32, 3.01s/it, loss=1.07, lr=0.00022]
Steps: 22%|██▏ | 111/500 [10:22<19:32, 3.01s/it, loss=0.824, lr=0.000222]
Steps: 22%|██▏ | 112/500 [10:24<17:16, 2.67s/it, loss=0.824, lr=0.000222]
Steps: 22%|██▏ | 112/500 [10:24<17:16, 2.67s/it, loss=0.916, lr=0.000224]
Steps: 23%|██▎ | 113/500 [10:31<26:58, 4.18s/it, loss=0.916, lr=0.000224]
Steps: 23%|██▎ | 113/500 [10:31<26:58, 4.18s/it, loss=0.793, lr=0.000226]
Steps: 23%|██▎ | 114/500 [10:33<22:27, 3.49s/it, loss=0.793, lr=0.000226]
Steps: 23%|██▎ | 114/500 [10:33<22:27, 3.49s/it, loss=0.927, lr=0.000228]
Steps: 23%|██▎ | 115/500 [10:35<19:17, 3.01s/it, loss=0.927, lr=0.000228]
Steps: 23%|██▎ | 115/500 [10:35<19:17, 3.01s/it, loss=0.924, lr=0.00023]
Steps: 23%|██▎ | 116/500 [10:37<17:03, 2.67s/it, loss=0.924, lr=0.00023]
Steps: 23%|██▎ | 116/500 [10:37<17:03, 2.67s/it, loss=1.04, lr=0.000232]
Steps: 23%|██▎ | 117/500 [10:44<26:32, 4.16s/it, loss=1.04, lr=0.000232]
Steps: 23%|██▎ | 117/500 [10:44<26:32, 4.16s/it, loss=0.857, lr=0.000234]
Steps: 24%|██▎ | 118/500 [10:46<22:06, 3.47s/it, loss=0.857, lr=0.000234]
Steps: 24%|██▎ | 118/500 [10:46<22:06, 3.47s/it, loss=0.91, lr=0.000236]
Steps: 24%|██▍ | 119/500 [10:48<19:00, 2.99s/it, loss=0.91, lr=0.000236]
Steps: 24%|██▍ | 119/500 [10:48<19:00, 2.99s/it, loss=0.781, lr=0.000238]
Steps: 24%|██▍ | 120/500 [10:50<16:49, 2.66s/it, loss=0.781, lr=0.000238]
Steps: 24%|██▍ | 120/500 [10:50<16:49, 2.66s/it, loss=0.937, lr=0.00024]
Steps: 24%|██▍ | 121/500 [10:58<26:42, 4.23s/it, loss=0.937, lr=0.00024]
Steps: 24%|██▍ | 121/500 [10:58<26:42, 4.23s/it, loss=0.876, lr=0.000242]
Steps: 24%|██▍ | 122/500 [11:00<22:10, 3.52s/it, loss=0.876, lr=0.000242]
Steps: 24%|██▍ | 122/500 [11:00<22:10, 3.52s/it, loss=0.971, lr=0.000244]
Steps: 25%|██▍ | 123/500 [11:02<19:00, 3.03s/it, loss=0.971, lr=0.000244]
Steps: 25%|██▍ | 123/500 [11:02<19:00, 3.03s/it, loss=0.812, lr=0.000246]
Steps: 25%|██▍ | 124/500 [11:04<16:47, 2.68s/it, loss=0.812, lr=0.000246]
Steps: 25%|██▍ | 124/500 [11:04<16:47, 2.68s/it, loss=1, lr=0.000248]
Steps: 25%|██▌ | 125/500 [11:11<26:07, 4.18s/it, loss=1, lr=0.000248]
Steps: 25%|██▌ | 125/500 [11:11<26:07, 4.18s/it, loss=0.97, lr=0.00025]
Steps: 25%|██▌ | 126/500 [11:13<21:44, 3.49s/it, loss=0.97, lr=0.00025]
Steps: 25%|██▌ | 126/500 [11:13<21:44, 3.49s/it, loss=1.07, lr=0.000252]
Steps: 25%|██▌ | 127/500 [11:15<18:40, 3.00s/it, loss=1.07, lr=0.000252]
Steps: 25%|██▌ | 127/500 [11:15<18:40, 3.00s/it, loss=0.814, lr=0.000254]
Steps: 26%|██▌ | 128/500 [11:17<16:31, 2.67s/it, loss=0.814, lr=0.000254]
Steps: 26%|██▌ | 128/500 [11:17<16:31, 2.67s/it, loss=0.904, lr=0.000256]
Steps: 26%|██▌ | 129/500 [11:25<25:55, 4.19s/it, loss=0.904, lr=0.000256]
Steps: 26%|██▌ | 129/500 [11:25<25:55, 4.19s/it, loss=0.885, lr=0.000258]
Steps: 26%|██▌ | 130/500 [11:27<21:33, 3.50s/it, loss=0.885, lr=0.000258]
Steps: 26%|██▌ | 130/500 [11:27<21:33, 3.50s/it, loss=0.923, lr=0.00026]
Steps: 26%|██▌ | 131/500 [11:28<18:30, 3.01s/it, loss=0.923, lr=0.00026]
Steps: 26%|██▌ | 131/500 [11:28<18:30, 3.01s/it, loss=0.812, lr=0.000262]
Steps: 26%|██▋ | 132/500 [11:30<16:21, 2.67s/it, loss=0.812, lr=0.000262]
Steps: 26%|██▋ | 132/500 [11:30<16:21, 2.67s/it, loss=0.986, lr=0.000264]
Steps: 27%|██▋ | 133/500 [11:38<25:53, 4.23s/it, loss=0.986, lr=0.000264]
Steps: 27%|██▋ | 133/500 [11:38<25:53, 4.23s/it, loss=0.823, lr=0.000266]
Steps: 27%|██▋ | 134/500 [11:40<21:31, 3.53s/it, loss=0.823, lr=0.000266]
Steps: 27%|██▋ | 134/500 [11:40<21:31, 3.53s/it, loss=1.06, lr=0.000268]
Steps: 27%|██▋ | 135/500 [11:42<18:26, 3.03s/it, loss=1.06, lr=0.000268]
Steps: 27%|██▋ | 135/500 [11:42<18:26, 3.03s/it, loss=1.07, lr=0.00027]
Steps: 27%|██▋ | 136/500 [11:44<16:17, 2.69s/it, loss=1.07, lr=0.00027]
Steps: 27%|██▋ | 136/500 [11:44<16:17, 2.69s/it, loss=0.961, lr=0.000272]
Steps: 27%|██▋ | 137/500 [11:51<25:19, 4.19s/it, loss=0.961, lr=0.000272]
Steps: 27%|██▋ | 137/500 [11:52<25:19, 4.19s/it, loss=0.842, lr=0.000274]
Steps: 28%|██▊ | 138/500 [11:53<21:04, 3.49s/it, loss=0.842, lr=0.000274]
Steps: 28%|██▊ | 138/500 [11:53<21:04, 3.49s/it, loss=0.952, lr=0.000276]
Steps: 28%|██▊ | 139/500 [11:55<18:05, 3.01s/it, loss=0.952, lr=0.000276]
Steps: 28%|██▊ | 139/500 [11:55<18:05, 3.01s/it, loss=0.901, lr=0.000278]
Steps: 28%|██▊ | 140/500 [11:57<16:00, 2.67s/it, loss=0.901, lr=0.000278]
Steps: 28%|██▊ | 140/500 [11:57<16:00, 2.67s/it, loss=0.926, lr=0.00028]
Steps: 28%|██▊ | 141/500 [12:05<25:06, 4.20s/it, loss=0.926, lr=0.00028]
Steps: 28%|██▊ | 141/500 [12:05<25:06, 4.20s/it, loss=0.808, lr=0.000282]
Steps: 28%|██▊ | 142/500 [12:07<20:52, 3.50s/it, loss=0.808, lr=0.000282]
Steps: 28%|██▊ | 142/500 [12:07<20:52, 3.50s/it, loss=0.926, lr=0.000284]
Steps: 29%|██▊ | 143/500 [12:09<17:55, 3.01s/it, loss=0.926, lr=0.000284]
Steps: 29%|██▊ | 143/500 [12:09<17:55, 3.01s/it, loss=0.951, lr=0.000286]
Steps: 29%|██▉ | 144/500 [12:11<15:51, 2.67s/it, loss=0.951, lr=0.000286]
Steps: 29%|██▉ | 144/500 [12:11<15:51, 2.67s/it, loss=0.911, lr=0.000288]
Steps: 29%|██▉ | 145/500 [12:18<24:43, 4.18s/it, loss=0.911, lr=0.000288]
Steps: 29%|██▉ | 145/500 [12:18<24:43, 4.18s/it, loss=0.806, lr=0.00029]
Steps: 29%|██▉ | 146/500 [12:20<20:34, 3.49s/it, loss=0.806, lr=0.00029]
Steps: 29%|██▉ | 146/500 [12:20<20:34, 3.49s/it, loss=0.901, lr=0.000292]
Steps: 29%|██▉ | 147/500 [12:22<17:40, 3.00s/it, loss=0.901, lr=0.000292]
Steps: 29%|██▉ | 147/500 [12:22<17:40, 3.00s/it, loss=0.847, lr=0.000294]
Steps: 30%|██▉ | 148/500 [12:24<15:38, 2.67s/it, loss=0.847, lr=0.000294]
Steps: 30%|██▉ | 148/500 [12:24<15:38, 2.67s/it, loss=0.963, lr=0.000296]
Steps: 30%|██▉ | 149/500 [12:32<24:33, 4.20s/it, loss=0.963, lr=0.000296]
Steps: 30%|██▉ | 149/500 [12:32<24:33, 4.20s/it, loss=1, lr=0.000298]
Steps: 30%|███ | 150/500 [12:33<20:24, 3.50s/it, loss=1, lr=0.000298]
Steps: 30%|███ | 150/500 [12:33<20:24, 3.50s/it, loss=0.897, lr=0.0003]
Steps: 30%|███ | 151/500 [12:35<17:30, 3.01s/it, loss=0.897, lr=0.0003]
Steps: 30%|███ | 151/500 [12:35<17:30, 3.01s/it, loss=0.842, lr=0.000302]
Steps: 30%|███ | 152/500 [12:37<15:28, 2.67s/it, loss=0.842, lr=0.000302]
Steps: 30%|███ | 152/500 [12:37<15:28, 2.67s/it, loss=1.07, lr=0.000304]
Steps: 31%|███ | 153/500 [12:45<24:13, 4.19s/it, loss=1.07, lr=0.000304]
Steps: 31%|███ | 153/500 [12:45<24:13, 4.19s/it, loss=0.861, lr=0.000306]
Steps: 31%|███ | 154/500 [12:47<20:09, 3.49s/it, loss=0.861, lr=0.000306]
Steps: 31%|███ | 154/500 [12:47<20:09, 3.49s/it, loss=0.903, lr=0.000308]
Steps: 31%|███ | 155/500 [12:49<17:17, 3.01s/it, loss=0.903, lr=0.000308]
Steps: 31%|███ | 155/500 [12:49<17:17, 3.01s/it, loss=0.86, lr=0.00031]
Steps: 31%|███ | 156/500 [12:51<15:18, 2.67s/it, loss=0.86, lr=0.00031]
Steps: 31%|███ | 156/500 [12:51<15:18, 2.67s/it, loss=0.904, lr=0.000312]
Steps: 31%|███▏ | 157/500 [12:58<24:02, 4.21s/it, loss=0.904, lr=0.000312]
Steps: 31%|███▏ | 157/500 [12:58<24:02, 4.21s/it, loss=1.05, lr=0.000314]
Steps: 32%|███▏ | 158/500 [13:00<19:58, 3.51s/it, loss=1.05, lr=0.000314]
Steps: 32%|███▏ | 158/500 [13:00<19:58, 3.51s/it, loss=1.02, lr=0.000316]
Steps: 32%|███▏ | 159/500 [13:02<17:08, 3.02s/it, loss=1.02, lr=0.000316]
Steps: 32%|███▏ | 159/500 [13:02<17:08, 3.02s/it, loss=0.964, lr=0.000318]
Steps: 32%|███▏ | 160/500 [13:04<15:08, 2.67s/it, loss=0.964, lr=0.000318]
Steps: 32%|███▏ | 160/500 [13:04<15:08, 2.67s/it, loss=0.909, lr=0.00032]
Steps: 32%|███▏ | 161/500 [13:12<23:30, 4.16s/it, loss=0.909, lr=0.00032]
Steps: 32%|███▏ | 161/500 [13:12<23:30, 4.16s/it, loss=0.874, lr=0.000322]
Steps: 32%|███▏ | 162/500 [13:13<19:34, 3.47s/it, loss=0.874, lr=0.000322]
Steps: 32%|███▏ | 162/500 [13:14<19:34, 3.47s/it, loss=0.932, lr=0.000324]
Steps: 33%|███▎ | 163/500 [13:15<16:49, 2.99s/it, loss=0.932, lr=0.000324]
Steps: 33%|███▎ | 163/500 [13:15<16:49, 2.99s/it, loss=0.917, lr=0.000326]
Steps: 33%|███▎ | 164/500 [13:17<14:53, 2.66s/it, loss=0.917, lr=0.000326]
Steps: 33%|███▎ | 164/500 [13:17<14:53, 2.66s/it, loss=1.07, lr=0.000328]
Steps: 33%|███▎ | 165/500 [13:25<23:19, 4.18s/it, loss=1.07, lr=0.000328]
Steps: 33%|███▎ | 165/500 [13:25<23:19, 4.18s/it, loss=0.855, lr=0.00033]
Steps: 33%|███▎ | 166/500 [13:27<19:24, 3.49s/it, loss=0.855, lr=0.00033]
Steps: 33%|███▎ | 166/500 [13:27<19:24, 3.49s/it, loss=0.986, lr=0.000332]
Steps: 33%|███▎ | 167/500 [13:29<16:39, 3.00s/it, loss=0.986, lr=0.000332]
Steps: 33%|███▎ | 167/500 [13:29<16:39, 3.00s/it, loss=0.814, lr=0.000334]
Steps: 34%|███▎ | 168/500 [13:31<14:44, 2.66s/it, loss=0.814, lr=0.000334]
Steps: 34%|███▎ | 168/500 [13:31<14:44, 2.66s/it, loss=0.92, lr=0.000336]
Steps: 34%|███▍ | 169/500 [13:38<23:13, 4.21s/it, loss=0.92, lr=0.000336]
Steps: 34%|███▍ | 169/500 [13:38<23:13, 4.21s/it, loss=0.835, lr=0.000338]
Steps: 34%|███▍ | 170/500 [13:40<19:17, 3.51s/it, loss=0.835, lr=0.000338]
Steps: 34%|███▍ | 170/500 [13:40<19:17, 3.51s/it, loss=1.08, lr=0.00034]
Steps: 34%|███▍ | 171/500 [13:42<16:33, 3.02s/it, loss=1.08, lr=0.00034]
Steps: 34%|███▍ | 171/500 [13:42<16:33, 3.02s/it, loss=0.988, lr=0.000342]
Steps: 34%|███▍ | 172/500 [13:44<14:37, 2.68s/it, loss=0.988, lr=0.000342]
Steps: 34%|███▍ | 172/500 [13:44<14:37, 2.68s/it, loss=1, lr=0.000344]
Steps: 35%|███▍ | 173/500 [13:52<22:40, 4.16s/it, loss=1, lr=0.000344]
Steps: 35%|███▍ | 173/500 [13:52<22:40, 4.16s/it, loss=1.04, lr=0.000346]
Steps: 35%|███▍ | 174/500 [13:54<18:52, 3.47s/it, loss=1.04, lr=0.000346]
Steps: 35%|███▍ | 174/500 [13:54<18:52, 3.47s/it, loss=1.05, lr=0.000348]
Steps: 35%|███▌ | 175/500 [13:55<16:13, 2.99s/it, loss=1.05, lr=0.000348]
Steps: 35%|███▌ | 175/500 [13:55<16:13, 2.99s/it, loss=0.996, lr=0.00035]
Steps: 35%|███▌ | 176/500 [13:57<14:20, 2.66s/it, loss=0.996, lr=0.00035]
Steps: 35%|███▌ | 176/500 [13:57<14:20, 2.66s/it, loss=1.06, lr=0.000352]
Steps: 35%|███▌ | 177/500 [14:05<22:27, 4.17s/it, loss=1.06, lr=0.000352]
Steps: 35%|███▌ | 177/500 [14:05<22:27, 4.17s/it, loss=0.994, lr=0.000354]
Steps: 36%|███▌ | 178/500 [14:07<18:41, 3.48s/it, loss=0.994, lr=0.000354]
Steps: 36%|███▌ | 178/500 [14:07<18:41, 3.48s/it, loss=0.987, lr=0.000356]
Steps: 36%|███▌ | 179/500 [14:09<16:02, 3.00s/it, loss=0.987, lr=0.000356]
Steps: 36%|███▌ | 179/500 [14:09<16:02, 3.00s/it, loss=0.81, lr=0.000358]
Steps: 36%|███▌ | 180/500 [14:11<14:11, 2.66s/it, loss=0.81, lr=0.000358]
Steps: 36%|███▌ | 180/500 [14:11<14:11, 2.66s/it, loss=0.944, lr=0.00036]
Steps: 36%|███▌ | 181/500 [14:18<22:05, 4.16s/it, loss=0.944, lr=0.00036]
Steps: 36%|███▌ | 181/500 [14:18<22:05, 4.16s/it, loss=0.856, lr=0.000362]
Steps: 36%|███▋ | 182/500 [14:20<18:23, 3.47s/it, loss=0.856, lr=0.000362]
Steps: 36%|███▋ | 182/500 [14:20<18:23, 3.47s/it, loss=0.956, lr=0.000364]
Steps: 37%|███▋ | 183/500 [14:22<15:48, 2.99s/it, loss=0.956, lr=0.000364]
Steps: 37%|███▋ | 183/500 [14:22<15:48, 2.99s/it, loss=0.823, lr=0.000366]
Steps: 37%|███▋ | 184/500 [14:24<13:59, 2.66s/it, loss=0.823, lr=0.000366]
Steps: 37%|███▋ | 184/500 [14:24<13:59, 2.66s/it, loss=0.963, lr=0.000368]
Steps: 37%|███▋ | 185/500 [14:31<21:45, 4.15s/it, loss=0.963, lr=0.000368]
Steps: 37%|███▋ | 185/500 [14:31<21:45, 4.15s/it, loss=0.971, lr=0.00037]
Steps: 37%|███▋ | 186/500 [14:33<18:07, 3.46s/it, loss=0.971, lr=0.00037]
Steps: 37%|███▋ | 186/500 [14:33<18:07, 3.46s/it, loss=1.01, lr=0.000372]
Steps: 37%|███▋ | 187/500 [14:35<15:34, 2.99s/it, loss=1.01, lr=0.000372]
Steps: 37%|███▋ | 187/500 [14:35<15:34, 2.99s/it, loss=0.855, lr=0.000374]
Steps: 38%|███▊ | 188/500 [14:37<13:47, 2.65s/it, loss=0.855, lr=0.000374]
Steps: 38%|███▊ | 188/500 [14:37<13:47, 2.65s/it, loss=1.06, lr=0.000376]
Steps: 38%|███▊ | 189/500 [14:45<21:33, 4.16s/it, loss=1.06, lr=0.000376]
Steps: 38%|███▊ | 189/500 [14:45<21:33, 4.16s/it, loss=0.906, lr=0.000378]
Steps: 38%|███▊ | 190/500 [14:47<17:56, 3.47s/it, loss=0.906, lr=0.000378]
Steps: 38%|███▊ | 190/500 [14:47<17:56, 3.47s/it, loss=0.957, lr=0.00038]
Steps: 38%|███▊ | 191/500 [14:49<15:24, 2.99s/it, loss=0.957, lr=0.00038]
Steps: 38%|███▊ | 191/500 [14:49<15:24, 2.99s/it, loss=0.874, lr=0.000382]
Steps: 38%|███▊ | 192/500 [14:50<13:38, 2.66s/it, loss=0.874, lr=0.000382]
Steps: 38%|███▊ | 192/500 [14:50<13:38, 2.66s/it, loss=0.902, lr=0.000384]
Steps: 39%|███▊ | 193/500 [14:58<21:24, 4.19s/it, loss=0.902, lr=0.000384]
Steps: 39%|███▊ | 193/500 [14:58<21:24, 4.19s/it, loss=1.06, lr=0.000386]
Steps: 39%|███▉ | 194/500 [15:00<17:48, 3.49s/it, loss=1.06, lr=0.000386]
Steps: 39%|███▉ | 194/500 [15:00<17:48, 3.49s/it, loss=0.955, lr=0.000388]
Steps: 39%|███▉ | 195/500 [15:02<15:16, 3.01s/it, loss=0.955, lr=0.000388]
Steps: 39%|███▉ | 195/500 [15:02<15:16, 3.01s/it, loss=0.808, lr=0.00039]
Steps: 39%|███▉ | 196/500 [15:04<13:30, 2.66s/it, loss=0.808, lr=0.00039]
Steps: 39%|███▉ | 196/500 [15:04<13:30, 2.66s/it, loss=0.925, lr=0.000392]
Steps: 39%|███▉ | 197/500 [15:12<21:19, 4.22s/it, loss=0.925, lr=0.000392]
Steps: 39%|███▉ | 197/500 [15:12<21:19, 4.22s/it, loss=0.869, lr=0.000394]
Steps: 40%|███▉ | 198/500 [15:13<17:42, 3.52s/it, loss=0.869, lr=0.000394]
Steps: 40%|███▉ | 198/500 [15:13<17:42, 3.52s/it, loss=1.08, lr=0.000396]
Steps: 40%|███▉ | 199/500 [15:15<15:10, 3.02s/it, loss=1.08, lr=0.000396]
Steps: 40%|███▉ | 199/500 [15:15<15:10, 3.02s/it, loss=0.829, lr=0.000398]
Steps: 40%|████ | 200/500 [15:17<13:23, 2.68s/it, loss=0.829, lr=0.000398]
Steps: 40%|████ | 200/500 [15:17<13:23, 2.68s/it, loss=1.05, lr=0.0004]
Steps: 40%|████ | 201/500 [15:25<20:55, 4.20s/it, loss=1.05, lr=0.0004]
Steps: 40%|████ | 201/500 [15:25<20:55, 4.20s/it, loss=1.03, lr=0.0004]
Steps: 40%|████ | 202/500 [15:27<17:23, 3.50s/it, loss=1.03, lr=0.0004]
Steps: 40%|████ | 202/500 [15:27<17:23, 3.50s/it, loss=1.06, lr=0.0004]
Steps: 41%|████ | 203/500 [15:29<14:54, 3.01s/it, loss=1.06, lr=0.0004]
Steps: 41%|████ | 203/500 [15:29<14:54, 3.01s/it, loss=0.846, lr=0.0004]
Steps: 41%|████ | 204/500 [15:31<13:10, 2.67s/it, loss=0.846, lr=0.0004]
Steps: 41%|████ | 204/500 [15:31<13:10, 2.67s/it, loss=0.921, lr=0.0004]
Steps: 41%|████ | 205/500 [15:38<20:37, 4.19s/it, loss=0.921, lr=0.0004]
Steps: 41%|████ | 205/500 [15:38<20:37, 4.19s/it, loss=0.856, lr=0.0004]
Steps: 41%|████ | 206/500 [15:40<17:08, 3.50s/it, loss=0.856, lr=0.0004]
Steps: 41%|████ | 206/500 [15:40<17:08, 3.50s/it, loss=1.06, lr=0.0004]
Steps: 41%|████▏ | 207/500 [15:42<14:41, 3.01s/it, loss=1.06, lr=0.0004]
Steps: 41%|████▏ | 207/500 [15:42<14:41, 3.01s/it, loss=0.81, lr=0.000399]
Steps: 42%|████▏ | 208/500 [15:44<12:59, 2.67s/it, loss=0.81, lr=0.000399]
Steps: 42%|████▏ | 208/500 [15:44<12:59, 2.67s/it, loss=0.961, lr=0.000399]
Steps: 42%|████▏ | 209/500 [15:52<20:15, 4.18s/it, loss=0.961, lr=0.000399]
Steps: 42%|████▏ | 209/500 [15:52<20:15, 4.18s/it, loss=0.809, lr=0.000399]
Steps: 42%|████▏ | 210/500 [15:54<16:50, 3.48s/it, loss=0.809, lr=0.000399]
Steps: 42%|████▏ | 210/500 [15:54<16:50, 3.48s/it, loss=0.983, lr=0.000399]
Steps: 42%|████▏ | 211/500 [15:55<14:27, 3.00s/it, loss=0.983, lr=0.000399]
Steps: 42%|████▏ | 211/500 [15:55<14:27, 3.00s/it, loss=0.865, lr=0.000399]
Steps: 42%|████▏ | 212/500 [15:57<12:46, 2.66s/it, loss=0.865, lr=0.000399]
Steps: 42%|████▏ | 212/500 [15:57<12:46, 2.66s/it, loss=0.927, lr=0.000398]
Steps: 43%|████▎ | 213/500 [16:05<19:55, 4.17s/it, loss=0.927, lr=0.000398]
Steps: 43%|████▎ | 213/500 [16:05<19:55, 4.17s/it, loss=0.799, lr=0.000398]
Steps: 43%|████▎ | 214/500 [16:07<16:34, 3.48s/it, loss=0.799, lr=0.000398]
Steps: 43%|████▎ | 214/500 [16:07<16:34, 3.48s/it, loss=1.02, lr=0.000398]
Steps: 43%|████▎ | 215/500 [16:09<14:13, 3.00s/it, loss=1.02, lr=0.000398]
Steps: 43%|████▎ | 215/500 [16:09<14:13, 3.00s/it, loss=0.864, lr=0.000398]
Steps: 43%|████▎ | 216/500 [16:11<12:34, 2.66s/it, loss=0.864, lr=0.000398]
Steps: 43%|████▎ | 216/500 [16:11<12:34, 2.66s/it, loss=0.976, lr=0.000397]
Steps: 43%|████▎ | 217/500 [16:18<19:35, 4.15s/it, loss=0.976, lr=0.000397]
Steps: 43%|████▎ | 217/500 [16:18<19:35, 4.15s/it, loss=0.859, lr=0.000397]
Steps: 44%|████▎ | 218/500 [16:20<16:18, 3.47s/it, loss=0.859, lr=0.000397]
Steps: 44%|████▎ | 218/500 [16:20<16:18, 3.47s/it, loss=0.9, lr=0.000396]
Steps: 44%|████▍ | 219/500 [16:22<14:00, 2.99s/it, loss=0.9, lr=0.000396]
Steps: 44%|████▍ | 219/500 [16:22<14:00, 2.99s/it, loss=0.935, lr=0.000396]
Steps: 44%|████▍ | 220/500 [16:24<12:23, 2.66s/it, loss=0.935, lr=0.000396]
Steps: 44%|████▍ | 220/500 [16:24<12:23, 2.66s/it, loss=0.919, lr=0.000396]
Steps: 44%|████▍ | 221/500 [16:31<19:15, 4.14s/it, loss=0.919, lr=0.000396]
Steps: 44%|████▍ | 221/500 [16:31<19:15, 4.14s/it, loss=0.849, lr=0.000395]
Steps: 44%|████▍ | 222/500 [16:33<16:01, 3.46s/it, loss=0.849, lr=0.000395]
Steps: 44%|████▍ | 222/500 [16:33<16:01, 3.46s/it, loss=0.985, lr=0.000395]
Steps: 45%|████▍ | 223/500 [16:35<13:46, 2.98s/it, loss=0.985, lr=0.000395]
Steps: 45%|████▍ | 223/500 [16:35<13:46, 2.98s/it, loss=0.798, lr=0.000394]
Steps: 45%|████▍ | 224/500 [16:37<12:11, 2.65s/it, loss=0.798, lr=0.000394]
Steps: 45%|████▍ | 224/500 [16:37<12:11, 2.65s/it, loss=0.896, lr=0.000394]
Steps: 45%|████▌ | 225/500 [16:45<19:01, 4.15s/it, loss=0.896, lr=0.000394]
Steps: 45%|████▌ | 225/500 [16:45<19:01, 4.15s/it, loss=0.772, lr=0.000393]
Steps: 45%|████▌ | 226/500 [16:47<15:50, 3.47s/it, loss=0.772, lr=0.000393]
Steps: 45%|████▌ | 226/500 [16:47<15:50, 3.47s/it, loss=0.968, lr=0.000393]
Steps: 45%|████▌ | 227/500 [16:48<13:36, 2.99s/it, loss=0.968, lr=0.000393]
Steps: 45%|████▌ | 227/500 [16:48<13:36, 2.99s/it, loss=0.943, lr=0.000392]
Steps: 46%|████▌ | 228/500 [16:50<12:01, 2.65s/it, loss=0.943, lr=0.000392]
Steps: 46%|████▌ | 228/500 [16:50<12:01, 2.65s/it, loss=0.951, lr=0.000391]
Steps: 46%|████▌ | 229/500 [16:58<18:52, 4.18s/it, loss=0.951, lr=0.000391]
Steps: 46%|████▌ | 229/500 [16:58<18:52, 4.18s/it, loss=0.839, lr=0.000391]
Steps: 46%|████▌ | 230/500 [17:00<15:41, 3.49s/it, loss=0.839, lr=0.000391]
Steps: 46%|████▌ | 230/500 [17:00<15:41, 3.49s/it, loss=1.02, lr=0.00039]
Steps: 46%|████▌ | 231/500 [17:02<13:27, 3.00s/it, loss=1.02, lr=0.00039]
Steps: 46%|████▌ | 231/500 [17:02<13:27, 3.00s/it, loss=0.854, lr=0.00039]
Steps: 46%|████▋ | 232/500 [17:04<11:53, 2.66s/it, loss=0.854, lr=0.00039]
Steps: 46%|████▋ | 232/500 [17:04<11:53, 2.66s/it, loss=0.958, lr=0.000389]
Steps: 47%|████▋ | 233/500 [17:11<18:44, 4.21s/it, loss=0.958, lr=0.000389]
Steps: 47%|████▋ | 233/500 [17:11<18:44, 4.21s/it, loss=1.06, lr=0.000388]
Steps: 47%|████▋ | 234/500 [17:13<15:33, 3.51s/it, loss=1.06, lr=0.000388]
Steps: 47%|████▋ | 234/500 [17:13<15:33, 3.51s/it, loss=1.07, lr=0.000387]
Steps: 47%|████▋ | 235/500 [17:15<13:20, 3.02s/it, loss=1.07, lr=0.000387]
Steps: 47%|████▋ | 235/500 [17:15<13:20, 3.02s/it, loss=0.996, lr=0.000387]
Steps: 47%|████▋ | 236/500 [17:17<11:46, 2.68s/it, loss=0.996, lr=0.000387]
Steps: 47%|████▋ | 236/500 [17:17<11:46, 2.68s/it, loss=0.889, lr=0.000386]
Steps: 47%|████▋ | 237/500 [17:25<18:25, 4.20s/it, loss=0.889, lr=0.000386]
Steps: 47%|████▋ | 237/500 [17:25<18:25, 4.20s/it, loss=0.789, lr=0.000385]
Steps: 48%|████▊ | 238/500 [17:27<15:18, 3.51s/it, loss=0.789, lr=0.000385]
Steps: 48%|████▊ | 238/500 [17:27<15:18, 3.51s/it, loss=1.04, lr=0.000384]
Steps: 48%|████▊ | 239/500 [17:29<13:07, 3.02s/it, loss=1.04, lr=0.000384]
Steps: 48%|████▊ | 239/500 [17:29<13:07, 3.02s/it, loss=0.85, lr=0.000384]
Steps: 48%|████▊ | 240/500 [17:31<11:35, 2.67s/it, loss=0.85, lr=0.000384]
Steps: 48%|████▊ | 240/500 [17:31<11:35, 2.67s/it, loss=0.976, lr=0.000383]
Steps: 48%|████▊ | 241/500 [17:38<18:00, 4.17s/it, loss=0.976, lr=0.000383]
Steps: 48%|████▊ | 241/500 [17:38<18:00, 4.17s/it, loss=0.842, lr=0.000382]
Steps: 48%|████▊ | 242/500 [17:40<14:57, 3.48s/it, loss=0.842, lr=0.000382]
Steps: 48%|████▊ | 242/500 [17:40<14:57, 3.48s/it, loss=1.01, lr=0.000381]
Steps: 49%|████▊ | 243/500 [17:42<12:50, 3.00s/it, loss=1.01, lr=0.000381]
Steps: 49%|████▊ | 243/500 [17:42<12:50, 3.00s/it, loss=0.848, lr=0.00038]
Steps: 49%|████▉ | 244/500 [17:44<11:21, 2.66s/it, loss=0.848, lr=0.00038]
Steps: 49%|████▉ | 244/500 [17:44<11:21, 2.66s/it, loss=1.07, lr=0.000379]
Steps: 49%|████▉ | 245/500 [17:51<17:42, 4.17s/it, loss=1.07, lr=0.000379]
Steps: 49%|████▉ | 245/500 [17:51<17:42, 4.17s/it, loss=1.05, lr=0.000378]
Steps: 49%|████▉ | 246/500 [17:53<14:43, 3.48s/it, loss=1.05, lr=0.000378]
Steps: 49%|████▉ | 246/500 [17:53<14:43, 3.48s/it, loss=0.908, lr=0.000377]
Steps: 49%|████▉ | 247/500 [17:55<12:38, 3.00s/it, loss=0.908, lr=0.000377]
Steps: 49%|████▉ | 247/500 [17:55<12:38, 3.00s/it, loss=0.8, lr=0.000376]
Steps: 50%|████▉ | 248/500 [17:57<11:10, 2.66s/it, loss=0.8, lr=0.000376]
Steps: 50%|████▉ | 248/500 [17:57<11:10, 2.66s/it, loss=1.07, lr=0.000375]
Steps: 50%|████▉ | 249/500 [18:05<17:29, 4.18s/it, loss=1.07, lr=0.000375]
Steps: 50%|████▉ | 249/500 [18:05<17:29, 4.18s/it, loss=0.966, lr=0.000374]
Steps: 50%|█████ | 250/500 [18:07<14:31, 3.49s/it, loss=0.966, lr=0.000374]
Steps: 50%|█████ | 250/500 [18:07<14:31, 3.49s/it, loss=1.07, lr=0.000373]
Steps: 50%|█████ | 251/500 [18:09<12:27, 3.00s/it, loss=1.07, lr=0.000373]
Steps: 50%|█████ | 251/500 [18:09<12:27, 3.00s/it, loss=0.789, lr=0.000372]
Steps: 50%|█████ | 252/500 [18:10<11:00, 2.66s/it, loss=0.789, lr=0.000372]
Steps: 50%|█████ | 252/500 [18:10<11:00, 2.66s/it, loss=0.945, lr=0.000371]
Steps: 51%|█████ | 253/500 [18:18<17:08, 4.16s/it, loss=0.945, lr=0.000371]
Steps: 51%|█████ | 253/500 [18:18<17:08, 4.16s/it, loss=0.83, lr=0.00037]
Steps: 51%|█████ | 254/500 [18:20<14:15, 3.48s/it, loss=0.83, lr=0.00037]
Steps: 51%|█████ | 254/500 [18:20<14:15, 3.48s/it, loss=0.999, lr=0.000369]
Steps: 51%|█████ | 255/500 [18:22<12:13, 3.00s/it, loss=0.999, lr=0.000369]
Steps: 51%|█████ | 255/500 [18:22<12:13, 3.00s/it, loss=0.883, lr=0.000368]
Steps: 51%|█████ | 256/500 [18:24<10:48, 2.66s/it, loss=0.883, lr=0.000368]
Steps: 51%|█████ | 256/500 [18:24<10:48, 2.66s/it, loss=1.07, lr=0.000367]
Steps: 51%|█████▏ | 257/500 [18:32<17:04, 4.21s/it, loss=1.07, lr=0.000367]
Steps: 51%|█████▏ | 257/500 [18:32<17:04, 4.21s/it, loss=0.81, lr=0.000365]
Steps: 52%|█████▏ | 258/500 [18:33<14:10, 3.51s/it, loss=0.81, lr=0.000365]
Steps: 52%|█████▏ | 258/500 [18:33<14:10, 3.51s/it, loss=0.94, lr=0.000364]
Steps: 52%|█████▏ | 259/500 [18:35<12:08, 3.02s/it, loss=0.94, lr=0.000364]
Steps: 52%|█████▏ | 259/500 [18:35<12:08, 3.02s/it, loss=0.963, lr=0.000363]
Steps: 52%|█████▏ | 260/500 [18:37<10:42, 2.68s/it, loss=0.963, lr=0.000363]
Steps: 52%|█████▏ | 260/500 [18:37<10:42, 2.68s/it, loss=0.942, lr=0.000362]
Steps: 52%|█████▏ | 261/500 [18:45<16:53, 4.24s/it, loss=0.942, lr=0.000362]
Steps: 52%|█████▏ | 261/500 [18:45<16:53, 4.24s/it, loss=0.962, lr=0.000361]
Steps: 52%|█████▏ | 262/500 [18:47<14:00, 3.53s/it, loss=0.962, lr=0.000361]
Steps: 52%|█████▏ | 262/500 [18:47<14:00, 3.53s/it, loss=0.922, lr=0.000359]
Steps: 53%|█████▎ | 263/500 [18:49<11:58, 3.03s/it, loss=0.922, lr=0.000359]
Steps: 53%|█████▎ | 263/500 [18:49<11:58, 3.03s/it, loss=0.8, lr=0.000358]
Steps: 53%|█████▎ | 264/500 [18:51<10:33, 2.69s/it, loss=0.8, lr=0.000358]
Steps: 53%|█████▎ | 264/500 [18:51<10:33, 2.69s/it, loss=0.954, lr=0.000357]
Steps: 53%|█████▎ | 265/500 [18:58<16:26, 4.20s/it, loss=0.954, lr=0.000357]
Steps: 53%|█████▎ | 265/500 [18:58<16:26, 4.20s/it, loss=0.852, lr=0.000355]
Steps: 53%|█████▎ | 266/500 [19:00<13:39, 3.50s/it, loss=0.852, lr=0.000355]
Steps: 53%|█████▎ | 266/500 [19:00<13:39, 3.50s/it, loss=0.9, lr=0.000354]
Steps: 53%|█████▎ | 267/500 [19:02<11:42, 3.02s/it, loss=0.9, lr=0.000354]
Steps: 53%|█████▎ | 267/500 [19:02<11:42, 3.02s/it, loss=0.838, lr=0.000353]
Steps: 54%|█████▎ | 268/500 [19:04<10:20, 2.67s/it, loss=0.838, lr=0.000353]
Steps: 54%|█████▎ | 268/500 [19:04<10:20, 2.67s/it, loss=1.07, lr=0.000351]
Steps: 54%|█████▍ | 269/500 [19:12<16:02, 4.17s/it, loss=1.07, lr=0.000351]
Steps: 54%|█████▍ | 269/500 [19:12<16:02, 4.17s/it, loss=0.983, lr=0.00035]
Steps: 54%|█████▍ | 270/500 [19:14<13:20, 3.48s/it, loss=0.983, lr=0.00035]
Steps: 54%|█████▍ | 270/500 [19:14<13:20, 3.48s/it, loss=0.957, lr=0.000349]
Steps: 54%|█████▍ | 271/500 [19:15<11:26, 3.00s/it, loss=0.957, lr=0.000349]
Steps: 54%|█████▍ | 271/500 [19:15<11:26, 3.00s/it, loss=0.828, lr=0.000347]
Steps: 54%|█████▍ | 272/500 [19:17<10:06, 2.66s/it, loss=0.828, lr=0.000347]
Steps: 54%|█████▍ | 272/500 [19:17<10:06, 2.66s/it, loss=0.946, lr=0.000346]
Steps: 55%|█████▍ | 273/500 [19:25<15:43, 4.16s/it, loss=0.946, lr=0.000346]
Steps: 55%|█████▍ | 273/500 [19:25<15:43, 4.16s/it, loss=1.01, lr=0.000344]
Steps: 55%|█████▍ | 274/500 [19:27<13:04, 3.47s/it, loss=1.01, lr=0.000344]
Steps: 55%|█████▍ | 274/500 [19:27<13:04, 3.47s/it, loss=0.915, lr=0.000343]
Steps: 55%|█████▌ | 275/500 [19:29<11:13, 2.99s/it, loss=0.915, lr=0.000343]
Steps: 55%|█████▌ | 275/500 [19:29<11:13, 2.99s/it, loss=0.881, lr=0.000341]
Steps: 55%|█████▌ | 276/500 [19:31<09:55, 2.66s/it, loss=0.881, lr=0.000341]
Steps: 55%|█████▌ | 276/500 [19:31<09:55, 2.66s/it, loss=0.896, lr=0.00034]
Steps: 55%|█████▌ | 277/500 [19:38<15:23, 4.14s/it, loss=0.896, lr=0.00034]
Steps: 55%|█████▌ | 277/500 [19:38<15:23, 4.14s/it, loss=0.863, lr=0.000338]
Steps: 56%|█████▌ | 278/500 [19:40<12:48, 3.46s/it, loss=0.863, lr=0.000338]
Steps: 56%|█████▌ | 278/500 [19:40<12:48, 3.46s/it, loss=0.968, lr=0.000337]
Steps: 56%|█████▌ | 279/500 [19:42<10:59, 2.99s/it, loss=0.968, lr=0.000337]
Steps: 56%|█████▌ | 279/500 [19:42<10:59, 2.99s/it, loss=0.817, lr=0.000335]
Steps: 56%|█████▌ | 280/500 [19:44<09:43, 2.65s/it, loss=0.817, lr=0.000335]
Steps: 56%|█████▌ | 280/500 [19:44<09:43, 2.65s/it, loss=1.07, lr=0.000334]
Steps: 56%|█████▌ | 281/500 [19:51<15:08, 4.15s/it, loss=1.07, lr=0.000334]
Steps: 56%|█████▌ | 281/500 [19:51<15:08, 4.15s/it, loss=0.795, lr=0.000332]
Steps: 56%|█████▋ | 282/500 [19:53<12:35, 3.46s/it, loss=0.795, lr=0.000332]
Steps: 56%|█████▋ | 282/500 [19:53<12:35, 3.46s/it, loss=0.99, lr=0.000331]
Steps: 57%|█████▋ | 283/500 [19:55<10:48, 2.99s/it, loss=0.99, lr=0.000331]
Steps: 57%|█████▋ | 283/500 [19:55<10:48, 2.99s/it, loss=0.844, lr=0.000329]
Steps: 57%|█████▋ | 284/500 [19:57<09:32, 2.65s/it, loss=0.844, lr=0.000329]
Steps: 57%|█████▋ | 284/500 [19:57<09:32, 2.65s/it, loss=0.94, lr=0.000327]
Steps: 57%|█████▋ | 285/500 [20:05<14:54, 4.16s/it, loss=0.94, lr=0.000327]
Steps: 57%|█████▋ | 285/500 [20:05<14:54, 4.16s/it, loss=0.9, lr=0.000326]
Steps: 57%|█████▋ | 286/500 [20:07<12:24, 3.48s/it, loss=0.9, lr=0.000326]
Steps: 57%|█████▋ | 286/500 [20:07<12:24, 3.48s/it, loss=1.06, lr=0.000324]
Steps: 57%|█████▋ | 287/500 [20:09<10:38, 3.00s/it, loss=1.06, lr=0.000324]
Steps: 57%|█████▋ | 287/500 [20:09<10:38, 3.00s/it, loss=1.02, lr=0.000323]
Steps: 58%|█████▊ | 288/500 [20:10<09:23, 2.66s/it, loss=1.02, lr=0.000323]
Steps: 58%|█████▊ | 288/500 [20:10<09:23, 2.66s/it, loss=1.03, lr=0.000321]
Steps: 58%|█████▊ | 289/500 [20:18<14:35, 4.15s/it, loss=1.03, lr=0.000321]
Steps: 58%|█████▊ | 289/500 [20:18<14:35, 4.15s/it, loss=1.05, lr=0.000319]
Steps: 58%|█████▊ | 290/500 [20:20<12:07, 3.47s/it, loss=1.05, lr=0.000319]
Steps: 58%|█████▊ | 290/500 [20:20<12:07, 3.47s/it, loss=0.899, lr=0.000318]
Steps: 58%|█████▊ | 291/500 [20:22<10:24, 2.99s/it, loss=0.899, lr=0.000318]
Steps: 58%|█████▊ | 291/500 [20:22<10:24, 2.99s/it, loss=1.03, lr=0.000316]
Steps: 58%|█████▊ | 292/500 [20:24<09:11, 2.65s/it, loss=1.03, lr=0.000316]
Steps: 58%|█████▊ | 292/500 [20:24<09:11, 2.65s/it, loss=1.03, lr=0.000314]
Steps: 59%|█████▊ | 293/500 [20:31<14:28, 4.19s/it, loss=1.03, lr=0.000314]
Steps: 59%|█████▊ | 293/500 [20:31<14:28, 4.19s/it, loss=0.821, lr=0.000312]
Steps: 59%|█████▉ | 294/500 [20:33<12:00, 3.50s/it, loss=0.821, lr=0.000312]
Steps: 59%|█████▉ | 294/500 [20:33<12:00, 3.50s/it, loss=0.884, lr=0.000311]
Steps: 59%|█████▉ | 295/500 [20:35<10:17, 3.01s/it, loss=0.884, lr=0.000311]
Steps: 59%|█████▉ | 295/500 [20:35<10:17, 3.01s/it, loss=0.792, lr=0.000309]
Steps: 59%|█████▉ | 296/500 [20:37<09:04, 2.67s/it, loss=0.792, lr=0.000309]
Steps: 59%|█████▉ | 296/500 [20:37<09:04, 2.67s/it, loss=1.01, lr=0.000307]
Steps: 59%|█████▉ | 297/500 [20:45<14:02, 4.15s/it, loss=1.01, lr=0.000307]
Steps: 59%|█████▉ | 297/500 [20:45<14:02, 4.15s/it, loss=0.787, lr=0.000305]
Steps: 60%|█████▉ | 298/500 [20:47<11:40, 3.47s/it, loss=0.787, lr=0.000305]
Steps: 60%|█████▉ | 298/500 [20:47<11:40, 3.47s/it, loss=0.909, lr=0.000304]
Steps: 60%|█████▉ | 299/500 [20:48<10:00, 2.99s/it, loss=0.909, lr=0.000304]
Steps: 60%|█████▉ | 299/500 [20:48<10:00, 2.99s/it, loss=0.832, lr=0.000302]
Steps: 60%|██████ | 300/500 [20:50<08:50, 2.65s/it, loss=0.832, lr=0.000302]
Steps: 60%|██████ | 300/500 [20:50<08:50, 2.65s/it, loss=0.945, lr=0.0003]
Steps: 60%|██████ | 301/500 [20:58<13:47, 4.16s/it, loss=0.945, lr=0.0003]
Steps: 60%|██████ | 301/500 [20:58<13:47, 4.16s/it, loss=0.866, lr=0.000298]
Steps: 60%|██████ | 302/500 [21:00<11:27, 3.47s/it, loss=0.866, lr=0.000298]
Steps: 60%|██████ | 302/500 [21:00<11:27, 3.47s/it, loss=0.905, lr=0.000296]
Steps: 61%|██████ | 303/500 [21:02<09:49, 2.99s/it, loss=0.905, lr=0.000296]
Steps: 61%|██████ | 303/500 [21:02<09:49, 2.99s/it, loss=0.818, lr=0.000295]
Steps: 61%|██████ | 304/500 [21:04<08:40, 2.66s/it, loss=0.818, lr=0.000295]
Steps: 61%|██████ | 304/500 [21:04<08:40, 2.66s/it, loss=0.912, lr=0.000293]
Steps: 61%|██████ | 305/500 [21:11<13:32, 4.16s/it, loss=0.912, lr=0.000293]
Steps: 61%|██████ | 305/500 [21:11<13:32, 4.16s/it, loss=0.784, lr=0.000291]
Steps: 61%|██████ | 306/500 [21:13<11:14, 3.48s/it, loss=0.784, lr=0.000291]
Steps: 61%|██████ | 306/500 [21:13<11:14, 3.48s/it, loss=1.03, lr=0.000289]
Steps: 61%|██████▏ | 307/500 [21:15<09:38, 3.00s/it, loss=1.03, lr=0.000289]
Steps: 61%|██████▏ | 307/500 [21:15<09:38, 3.00s/it, loss=1.05, lr=0.000287]
Steps: 62%|██████▏ | 308/500 [21:17<08:30, 2.66s/it, loss=1.05, lr=0.000287]
Steps: 62%|██████▏ | 308/500 [21:17<08:30, 2.66s/it, loss=1.04, lr=0.000285]
Steps: 62%|██████▏ | 309/500 [21:25<13:15, 4.16s/it, loss=1.04, lr=0.000285]
Steps: 62%|██████▏ | 309/500 [21:25<13:15, 4.16s/it, loss=1.06, lr=0.000283]
Steps: 62%|██████▏ | 310/500 [21:26<11:00, 3.47s/it, loss=1.06, lr=0.000283]
Steps: 62%|██████▏ | 310/500 [21:26<11:00, 3.47s/it, loss=0.99, lr=0.000281]
Steps: 62%|██████▏ | 311/500 [21:28<09:25, 2.99s/it, loss=0.99, lr=0.000281]
Steps: 62%|██████▏ | 311/500 [21:28<09:25, 2.99s/it, loss=0.86, lr=0.000279]
Steps: 62%|██████▏ | 312/500 [21:30<08:19, 2.66s/it, loss=0.86, lr=0.000279]
Steps: 62%|██████▏ | 312/500 [21:30<08:19, 2.66s/it, loss=0.877, lr=0.000278]
Steps: 63%|██████▎ | 313/500 [21:38<13:00, 4.17s/it, loss=0.877, lr=0.000278]
Steps: 63%|██████▎ | 313/500 [21:38<13:00, 4.17s/it, loss=0.82, lr=0.000276]
Steps: 63%|██████▎ | 314/500 [21:40<10:47, 3.48s/it, loss=0.82, lr=0.000276]
Steps: 63%|██████▎ | 314/500 [21:40<10:47, 3.48s/it, loss=0.89, lr=0.000274]
Steps: 63%|██████▎ | 315/500 [21:42<09:15, 3.00s/it, loss=0.89, lr=0.000274]
Steps: 63%|██████▎ | 315/500 [21:42<09:15, 3.00s/it, loss=0.855, lr=0.000272]
Steps: 63%|██████▎ | 316/500 [21:43<08:09, 2.66s/it, loss=0.855, lr=0.000272]
Steps: 63%|██████▎ | 316/500 [21:43<08:09, 2.66s/it, loss=1.01, lr=0.00027]
Steps: 63%|██████▎ | 317/500 [21:51<12:43, 4.17s/it, loss=1.01, lr=0.00027]
Steps: 63%|██████▎ | 317/500 [21:51<12:43, 4.17s/it, loss=0.9, lr=0.000268]
Steps: 64%|██████▎ | 318/500 [21:53<10:33, 3.48s/it, loss=0.9, lr=0.000268]
Steps: 64%|██████▎ | 318/500 [21:53<10:33, 3.48s/it, loss=0.966, lr=0.000266]
Steps: 64%|██████▍ | 319/500 [21:55<09:02, 3.00s/it, loss=0.966, lr=0.000266]
Steps: 64%|██████▍ | 319/500 [21:55<09:02, 3.00s/it, loss=0.968, lr=0.000264]
Steps: 64%|██████▍ | 320/500 [21:57<07:58, 2.66s/it, loss=0.968, lr=0.000264]
Steps: 64%|██████▍ | 320/500 [21:57<07:58, 2.66s/it, loss=0.891, lr=0.000262]
Steps: 64%|██████▍ | 321/500 [22:04<12:20, 4.14s/it, loss=0.891, lr=0.000262]
Steps: 64%|██████▍ | 321/500 [22:04<12:20, 4.14s/it, loss=0.787, lr=0.00026]
Steps: 64%|██████▍ | 322/500 [22:06<10:15, 3.46s/it, loss=0.787, lr=0.00026]
Steps: 64%|██████▍ | 322/500 [22:06<10:15, 3.46s/it, loss=0.878, lr=0.000258]
Steps: 65%|██████▍ | 323/500 [22:08<08:47, 2.98s/it, loss=0.878, lr=0.000258]
Steps: 65%|██████▍ | 323/500 [22:08<08:47, 2.98s/it, loss=0.852, lr=0.000256]
Steps: 65%|██████▍ | 324/500 [22:10<07:46, 2.65s/it, loss=0.852, lr=0.000256]
Steps: 65%|██████▍ | 324/500 [22:10<07:46, 2.65s/it, loss=1.02, lr=0.000254]
Steps: 65%|██████▌ | 325/500 [22:18<12:03, 4.14s/it, loss=1.02, lr=0.000254]
Steps: 65%|██████▌ | 325/500 [22:18<12:03, 4.14s/it, loss=0.878, lr=0.000252]
Steps: 65%|██████▌ | 326/500 [22:19<10:01, 3.46s/it, loss=0.878, lr=0.000252]
Steps: 65%|██████▌ | 326/500 [22:19<10:01, 3.46s/it, loss=0.878, lr=0.00025]
Steps: 65%|██████▌ | 327/500 [22:21<08:35, 2.98s/it, loss=0.878, lr=0.00025]
Steps: 65%|██████▌ | 327/500 [22:21<08:35, 2.98s/it, loss=0.845, lr=0.000248]
Steps: 66%|██████▌ | 328/500 [22:23<07:35, 2.65s/it, loss=0.845, lr=0.000248]
Steps: 66%|██████▌ | 328/500 [22:23<07:35, 2.65s/it, loss=0.905, lr=0.000246]
Steps: 66%|██████▌ | 329/500 [22:31<11:55, 4.18s/it, loss=0.905, lr=0.000246]
Steps: 66%|██████▌ | 329/500 [22:31<11:55, 4.18s/it, loss=1.05, lr=0.000244]
Steps: 66%|██████▌ | 330/500 [22:33<09:53, 3.49s/it, loss=1.05, lr=0.000244]
Steps: 66%|██████▌ | 330/500 [22:33<09:53, 3.49s/it, loss=0.936, lr=0.000242]
Steps: 66%|██████▌ | 331/500 [22:35<08:27, 3.00s/it, loss=0.936, lr=0.000242]
Steps: 66%|██████▌ | 331/500 [22:35<08:27, 3.00s/it, loss=0.834, lr=0.00024]
Steps: 66%|██████▋ | 332/500 [22:37<07:27, 2.67s/it, loss=0.834, lr=0.00024]
Steps: 66%|██████▋ | 332/500 [22:37<07:27, 2.67s/it, loss=1, lr=0.000237]
Steps: 67%|██████▋ | 333/500 [22:44<11:31, 4.14s/it, loss=1, lr=0.000237]
Steps: 67%|██████▋ | 333/500 [22:44<11:31, 4.14s/it, loss=0.791, lr=0.000235]
Steps: 67%|██████▋ | 334/500 [22:46<09:34, 3.46s/it, loss=0.791, lr=0.000235]
Steps: 67%|██████▋ | 334/500 [22:46<09:34, 3.46s/it, loss=0.893, lr=0.000233]
Steps: 67%|██████▋ | 335/500 [22:48<08:12, 2.98s/it, loss=0.893, lr=0.000233]
Steps: 67%|██████▋ | 335/500 [22:48<08:12, 2.98s/it, loss=1.03, lr=0.000231]
Steps: 67%|██████▋ | 336/500 [22:50<07:14, 2.65s/it, loss=1.03, lr=0.000231]
Steps: 67%|██████▋ | 336/500 [22:50<07:14, 2.65s/it, loss=1.03, lr=0.000229]
Steps: 67%|██████▋ | 337/500 [22:58<11:23, 4.20s/it, loss=1.03, lr=0.000229]
Steps: 67%|██████▋ | 337/500 [22:58<11:23, 4.20s/it, loss=1.03, lr=0.000227]
Steps: 68%|██████▊ | 338/500 [22:59<09:26, 3.50s/it, loss=1.03, lr=0.000227]
Steps: 68%|██████▊ | 338/500 [22:59<09:26, 3.50s/it, loss=0.882, lr=0.000225]
Steps: 68%|██████▊ | 339/500 [23:01<08:04, 3.01s/it, loss=0.882, lr=0.000225]
Steps: 68%|██████▊ | 339/500 [23:01<08:04, 3.01s/it, loss=0.792, lr=0.000223]
Steps: 68%|██████▊ | 340/500 [23:03<07:07, 2.67s/it, loss=0.792, lr=0.000223]
Steps: 68%|██████▊ | 340/500 [23:03<07:07, 2.67s/it, loss=0.974, lr=0.000221]
Steps: 68%|██████▊ | 341/500 [23:11<10:57, 4.14s/it, loss=0.974, lr=0.000221]
Steps: 68%|██████▊ | 341/500 [23:11<10:57, 4.14s/it, loss=0.83, lr=0.000219]
Steps: 68%|██████▊ | 342/500 [23:13<09:06, 3.46s/it, loss=0.83, lr=0.000219]
Steps: 68%|██████▊ | 342/500 [23:13<09:06, 3.46s/it, loss=0.874, lr=0.000217]
Steps: 69%|██████▊ | 343/500 [23:14<07:48, 2.98s/it, loss=0.874, lr=0.000217]
Steps: 69%|██████▊ | 343/500 [23:15<07:48, 2.98s/it, loss=0.789, lr=0.000215]
Steps: 69%|██████▉ | 344/500 [23:16<06:53, 2.65s/it, loss=0.789, lr=0.000215]
Steps: 69%|██████▉ | 344/500 [23:16<06:53, 2.65s/it, loss=0.975, lr=0.000213]
Steps: 69%|██████▉ | 345/500 [23:24<10:41, 4.14s/it, loss=0.975, lr=0.000213]
Steps: 69%|██████▉ | 345/500 [23:24<10:41, 4.14s/it, loss=0.838, lr=0.00021]
Steps: 69%|██████▉ | 346/500 [23:26<08:52, 3.46s/it, loss=0.838, lr=0.00021]
Steps: 69%|██████▉ | 346/500 [23:26<08:52, 3.46s/it, loss=1.02, lr=0.000208]
Steps: 69%|██████▉ | 347/500 [23:28<07:36, 2.98s/it, loss=1.02, lr=0.000208]
Steps: 69%|██████▉ | 347/500 [23:28<07:36, 2.98s/it, loss=0.815, lr=0.000206]
Steps: 70%|██████▉ | 348/500 [23:30<06:43, 2.65s/it, loss=0.815, lr=0.000206]
Steps: 70%|██████▉ | 348/500 [23:30<06:43, 2.65s/it, loss=0.865, lr=0.000204]
Steps: 70%|██████▉ | 349/500 [23:37<10:27, 4.15s/it, loss=0.865, lr=0.000204]
Steps: 70%|██████▉ | 349/500 [23:37<10:27, 4.15s/it, loss=0.806, lr=0.000202]
Steps: 70%|███████ | 350/500 [23:39<08:40, 3.47s/it, loss=0.806, lr=0.000202]
Steps: 70%|███████ | 350/500 [23:39<08:40, 3.47s/it, loss=0.869, lr=0.0002]
Steps: 70%|███████ | 351/500 [23:41<07:25, 2.99s/it, loss=0.869, lr=0.0002]
Steps: 70%|███████ | 351/500 [23:41<07:25, 2.99s/it, loss=0.812, lr=0.000198]
Steps: 70%|███████ | 352/500 [23:43<06:33, 2.66s/it, loss=0.812, lr=0.000198]
Steps: 70%|███████ | 352/500 [23:43<06:33, 2.66s/it, loss=1.01, lr=0.000196]
Steps: 71%|███████ | 353/500 [23:51<10:15, 4.19s/it, loss=1.01, lr=0.000196]
Steps: 71%|███████ | 353/500 [23:51<10:15, 4.19s/it, loss=1.01, lr=0.000194]
Steps: 71%|███████ | 354/500 [23:53<08:29, 3.49s/it, loss=1.01, lr=0.000194]
Steps: 71%|███████ | 354/500 [23:53<08:29, 3.49s/it, loss=0.951, lr=0.000192]
Steps: 71%|███████ | 355/500 [23:54<07:15, 3.01s/it, loss=0.951, lr=0.000192]
Steps: 71%|███████ | 355/500 [23:54<07:15, 3.01s/it, loss=0.849, lr=0.00019]
Steps: 71%|███████ | 356/500 [23:56<06:23, 2.67s/it, loss=0.849, lr=0.00019]
Steps: 71%|███████ | 356/500 [23:56<06:23, 2.67s/it, loss=1.06, lr=0.000187]
Steps: 71%|███████▏ | 357/500 [24:04<09:51, 4.14s/it, loss=1.06, lr=0.000187]
Steps: 71%|███████▏ | 357/500 [24:04<09:51, 4.14s/it, loss=1.03, lr=0.000185]
Steps: 72%|███████▏ | 358/500 [24:06<08:10, 3.46s/it, loss=1.03, lr=0.000185]
Steps: 72%|███████▏ | 358/500 [24:06<08:10, 3.46s/it, loss=0.889, lr=0.000183]
Steps: 72%|███████▏ | 359/500 [24:08<07:00, 2.98s/it, loss=0.889, lr=0.000183]
Steps: 72%|███████▏ | 359/500 [24:08<07:00, 2.98s/it, loss=0.818, lr=0.000181]
Steps: 72%|███████▏ | 360/500 [24:09<06:10, 2.65s/it, loss=0.818, lr=0.000181]
Steps: 72%|███████▏ | 360/500 [24:09<06:10, 2.65s/it, loss=1, lr=0.000179]
Steps: 72%|███████▏ | 361/500 [24:17<09:34, 4.13s/it, loss=1, lr=0.000179]
Steps: 72%|███████▏ | 361/500 [24:17<09:34, 4.13s/it, loss=0.996, lr=0.000177]
Steps: 72%|███████▏ | 362/500 [24:19<07:57, 3.46s/it, loss=0.996, lr=0.000177]
Steps: 72%|███████▏ | 362/500 [24:19<07:57, 3.46s/it, loss=1.04, lr=0.000175]
Steps: 73%|███████▎ | 363/500 [24:21<06:48, 2.98s/it, loss=1.04, lr=0.000175]
Steps: 73%|███████▎ | 363/500 [24:21<06:48, 2.98s/it, loss=0.784, lr=0.000173]
Steps: 73%|███████▎ | 364/500 [24:23<06:00, 2.65s/it, loss=0.784, lr=0.000173]
Steps: 73%|███████▎ | 364/500 [24:23<06:00, 2.65s/it, loss=0.997, lr=0.000171]
Steps: 73%|███████▎ | 365/500 [24:30<09:22, 4.17s/it, loss=0.997, lr=0.000171]
Steps: 73%|███████▎ | 365/500 [24:30<09:22, 4.17s/it, loss=0.794, lr=0.000169]
Steps: 73%|███████▎ | 366/500 [24:32<07:45, 3.48s/it, loss=0.794, lr=0.000169]
Steps: 73%|███████▎ | 366/500 [24:32<07:45, 3.48s/it, loss=0.874, lr=0.000167]
Steps: 73%|███████▎ | 367/500 [24:34<06:38, 3.00s/it, loss=0.874, lr=0.000167]
Steps: 73%|███████▎ | 367/500 [24:34<06:38, 3.00s/it, loss=0.848, lr=0.000165]
Steps: 74%|███████▎ | 368/500 [24:36<05:50, 2.66s/it, loss=0.848, lr=0.000165]
Steps: 74%|███████▎ | 368/500 [24:36<05:50, 2.66s/it, loss=0.964, lr=0.000163]
Steps: 74%|███████▍ | 369/500 [24:44<09:07, 4.18s/it, loss=0.964, lr=0.000163]
Steps: 74%|███████▍ | 369/500 [24:44<09:07, 4.18s/it, loss=0.778, lr=0.00016]
Steps: 74%|███████▍ | 370/500 [24:46<07:33, 3.49s/it, loss=0.778, lr=0.00016]
Steps: 74%|███████▍ | 370/500 [24:46<07:33, 3.49s/it, loss=1.04, lr=0.000158]
Steps: 74%|███████▍ | 371/500 [24:47<06:27, 3.00s/it, loss=1.04, lr=0.000158]
Steps: 74%|███████▍ | 371/500 [24:47<06:27, 3.00s/it, loss=1, lr=0.000156]
Steps: 74%|███████▍ | 372/500 [24:49<05:41, 2.67s/it, loss=1, lr=0.000156]
Steps: 74%|███████▍ | 372/500 [24:49<05:41, 2.67s/it, loss=0.937, lr=0.000154]
Steps: 75%|███████▍ | 373/500 [24:57<08:47, 4.15s/it, loss=0.937, lr=0.000154]
Steps: 75%|███████▍ | 373/500 [24:57<08:47, 4.15s/it, loss=1.05, lr=0.000152]
Steps: 75%|███████▍ | 374/500 [24:59<07:17, 3.47s/it, loss=1.05, lr=0.000152]
Steps: 75%|███████▍ | 374/500 [24:59<07:17, 3.47s/it, loss=0.894, lr=0.00015]
Steps: 75%|███████▌ | 375/500 [25:01<06:13, 2.99s/it, loss=0.894, lr=0.00015]
Steps: 75%|███████▌ | 375/500 [25:01<06:13, 2.99s/it, loss=0.821, lr=0.000148]
Steps: 75%|███████▌ | 376/500 [25:03<05:29, 2.66s/it, loss=0.821, lr=0.000148]
Steps: 75%|███████▌ | 376/500 [25:03<05:29, 2.66s/it, loss=1.04, lr=0.000146]
Steps: 75%|███████▌ | 377/500 [25:10<08:34, 4.19s/it, loss=1.04, lr=0.000146]
Steps: 75%|███████▌ | 377/500 [25:10<08:34, 4.19s/it, loss=0.978, lr=0.000144]
Steps: 76%|███████▌ | 378/500 [25:12<07:05, 3.49s/it, loss=0.978, lr=0.000144]
Steps: 76%|███████▌ | 378/500 [25:12<07:05, 3.49s/it, loss=0.943, lr=0.000142]
Steps: 76%|███████▌ | 379/500 [25:14<06:03, 3.01s/it, loss=0.943, lr=0.000142]
Steps: 76%|███████▌ | 379/500 [25:14<06:03, 3.01s/it, loss=1.05, lr=0.00014]
Steps: 76%|███████▌ | 380/500 [25:16<05:20, 2.67s/it, loss=1.05, lr=0.00014]
Steps: 76%|███████▌ | 380/500 [25:16<05:20, 2.67s/it, loss=0.892, lr=0.000138]
Steps: 76%|███████▌ | 381/500 [25:24<08:12, 4.14s/it, loss=0.892, lr=0.000138]
Steps: 76%|███████▌ | 381/500 [25:24<08:12, 4.14s/it, loss=0.82, lr=0.000136]
Steps: 76%|███████▋ | 382/500 [25:25<06:48, 3.46s/it, loss=0.82, lr=0.000136]
Steps: 76%|███████▋ | 382/500 [25:25<06:48, 3.46s/it, loss=1.02, lr=0.000134]
Steps: 77%|███████▋ | 383/500 [25:27<05:49, 2.98s/it, loss=1.02, lr=0.000134]
Steps: 77%|███████▋ | 383/500 [25:27<05:49, 2.98s/it, loss=0.785, lr=0.000132]
Steps: 77%|███████▋ | 384/500 [25:29<05:07, 2.65s/it, loss=0.785, lr=0.000132]
Steps: 77%|███████▋ | 384/500 [25:29<05:07, 2.65s/it, loss=0.898, lr=0.00013]
Steps: 77%|███████▋ | 385/500 [25:37<07:56, 4.14s/it, loss=0.898, lr=0.00013]
Steps: 77%|███████▋ | 385/500 [25:37<07:56, 4.14s/it, loss=0.836, lr=0.000128]
Steps: 77%|███████▋ | 386/500 [25:39<06:34, 3.46s/it, loss=0.836, lr=0.000128]
Steps: 77%|███████▋ | 386/500 [25:39<06:34, 3.46s/it, loss=0.894, lr=0.000126]
Steps: 77%|███████▋ | 387/500 [25:41<05:37, 2.98s/it, loss=0.894, lr=0.000126]
Steps: 77%|███████▋ | 387/500 [25:41<05:37, 2.98s/it, loss=0.776, lr=0.000124]
Steps: 78%|███████▊ | 388/500 [25:42<04:56, 2.65s/it, loss=0.776, lr=0.000124]
Steps: 78%|███████▊ | 388/500 [25:42<04:56, 2.65s/it, loss=1.06, lr=0.000122]
Steps: 78%|███████▊ | 389/500 [25:50<07:38, 4.14s/it, loss=1.06, lr=0.000122]
Steps: 78%|███████▊ | 389/500 [25:50<07:38, 4.14s/it, loss=0.79, lr=0.000121]
Steps: 78%|███████▊ | 390/500 [25:52<06:20, 3.46s/it, loss=0.79, lr=0.000121]
Steps: 78%|███████▊ | 390/500 [25:52<06:20, 3.46s/it, loss=0.867, lr=0.000119]
Steps: 78%|███████▊ | 391/500 [25:54<05:25, 2.98s/it, loss=0.867, lr=0.000119]
Steps: 78%|███████▊ | 391/500 [25:54<05:25, 2.98s/it, loss=0.79, lr=0.000117]
Steps: 78%|███████▊ | 392/500 [25:56<04:46, 2.65s/it, loss=0.79, lr=0.000117]
Steps: 78%|███████▊ | 392/500 [25:56<04:46, 2.65s/it, loss=0.867, lr=0.000115]
Steps: 79%|███████▊ | 393/500 [26:03<07:21, 4.12s/it, loss=0.867, lr=0.000115]
Steps: 79%|███████▊ | 393/500 [26:03<07:21, 4.12s/it, loss=0.818, lr=0.000113]
Steps: 79%|███████▉ | 394/500 [26:05<06:05, 3.45s/it, loss=0.818, lr=0.000113]
Steps: 79%|███████▉ | 394/500 [26:05<06:05, 3.45s/it, loss=0.931, lr=0.000111]
Steps: 79%|███████▉ | 395/500 [26:07<05:12, 2.97s/it, loss=0.931, lr=0.000111]
Steps: 79%|███████▉ | 395/500 [26:07<05:12, 2.97s/it, loss=0.821, lr=0.000109]
Steps: 79%|███████▉ | 396/500 [26:09<04:35, 2.65s/it, loss=0.821, lr=0.000109]
Steps: 79%|███████▉ | 396/500 [26:09<04:35, 2.65s/it, loss=0.916, lr=0.000107]
Steps: 79%|███████▉ | 397/500 [26:17<07:10, 4.18s/it, loss=0.916, lr=0.000107]
Steps: 79%|███████▉ | 397/500 [26:17<07:10, 4.18s/it, loss=0.805, lr=0.000105]
Steps: 80%|███████▉ | 398/500 [26:18<05:55, 3.49s/it, loss=0.805, lr=0.000105]
Steps: 80%|███████▉ | 398/500 [26:18<05:55, 3.49s/it, loss=1.06, lr=0.000104]
Steps: 80%|███████▉ | 399/500 [26:20<05:03, 3.00s/it, loss=1.06, lr=0.000104]
Steps: 80%|███████▉ | 399/500 [26:20<05:03, 3.00s/it, loss=0.812, lr=0.000102]
Steps: 80%|████████ | 400/500 [26:22<04:26, 2.66s/it, loss=0.812, lr=0.000102]
Steps: 80%|████████ | 400/500 [26:22<04:26, 2.66s/it, loss=0.863, lr=0.0001]
Steps: 80%|████████ | 401/500 [26:30<06:54, 4.19s/it, loss=0.863, lr=0.0001]
Steps: 80%|████████ | 401/500 [26:30<06:54, 4.19s/it, loss=0.843, lr=9.82e-5]
Steps: 80%|████████ | 402/500 [26:32<05:42, 3.49s/it, loss=0.843, lr=9.82e-5]
Steps: 80%|████████ | 402/500 [26:32<05:42, 3.49s/it, loss=0.926, lr=9.64e-5]
Steps: 81%|████████ | 403/500 [26:34<04:51, 3.01s/it, loss=0.926, lr=9.64e-5]
Steps: 81%|████████ | 403/500 [26:34<04:51, 3.01s/it, loss=0.953, lr=9.46e-5]
Steps: 81%|████████ | 404/500 [26:36<04:15, 2.67s/it, loss=0.953, lr=9.46e-5]
Steps: 81%|████████ | 404/500 [26:36<04:15, 2.67s/it, loss=1.01, lr=9.28e-5]
Steps: 81%|████████ | 405/500 [26:43<06:40, 4.22s/it, loss=1.01, lr=9.28e-5]
Steps: 81%|████████ | 405/500 [26:43<06:40, 4.22s/it, loss=0.825, lr=9.11e-5]
Steps: 81%|████████ | 406/500 [26:45<05:30, 3.52s/it, loss=0.825, lr=9.11e-5]
Steps: 81%|████████ | 406/500 [26:45<05:30, 3.52s/it, loss=0.909, lr=8.93e-5]
Steps: 81%|████████▏ | 407/500 [26:47<04:41, 3.02s/it, loss=0.909, lr=8.93e-5]
Steps: 81%|████████▏ | 407/500 [26:47<04:41, 3.02s/it, loss=0.781, lr=8.76e-5]
Steps: 82%|████████▏ | 408/500 [26:49<04:06, 2.68s/it, loss=0.781, lr=8.76e-5]
Steps: 82%|████████▏ | 408/500 [26:49<04:06, 2.68s/it, loss=0.862, lr=8.59e-5]
Steps: 82%|████████▏ | 409/500 [26:57<06:22, 4.20s/it, loss=0.862, lr=8.59e-5]
Steps: 82%|████████▏ | 409/500 [26:57<06:22, 4.20s/it, loss=0.822, lr=8.41e-5]
Steps: 82%|████████▏ | 410/500 [26:59<05:15, 3.50s/it, loss=0.822, lr=8.41e-5]
Steps: 82%|████████▏ | 410/500 [26:59<05:15, 3.50s/it, loss=1.01, lr=8.24e-5]
Steps: 82%|████████▏ | 411/500 [27:01<04:28, 3.01s/it, loss=1.01, lr=8.24e-5]
Steps: 82%|████████▏ | 411/500 [27:01<04:28, 3.01s/it, loss=0.829, lr=8.08e-5]
Steps: 82%|████████▏ | 412/500 [27:02<03:55, 2.67s/it, loss=0.829, lr=8.08e-5]
Steps: 82%|████████▏ | 412/500 [27:02<03:55, 2.67s/it, loss=0.92, lr=7.91e-5]
Steps: 83%|████████▎ | 413/500 [27:10<06:02, 4.16s/it, loss=0.92, lr=7.91e-5]
Steps: 83%|████████▎ | 413/500 [27:10<06:02, 4.16s/it, loss=0.789, lr=7.74e-5]
Steps: 83%|████████▎ | 414/500 [27:12<04:59, 3.48s/it, loss=0.789, lr=7.74e-5]
Steps: 83%|████████▎ | 414/500 [27:12<04:59, 3.48s/it, loss=0.979, lr=7.58e-5]
Steps: 83%|████████▎ | 415/500 [27:14<04:14, 3.00s/it, loss=0.979, lr=7.58e-5]
Steps: 83%|████████▎ | 415/500 [27:14<04:14, 3.00s/it, loss=0.963, lr=7.41e-5]
Steps: 83%|████████▎ | 416/500 [27:16<03:43, 2.66s/it, loss=0.963, lr=7.41e-5]
Steps: 83%|████████▎ | 416/500 [27:16<03:43, 2.66s/it, loss=0.903, lr=7.25e-5]
Steps: 83%|████████▎ | 417/500 [27:23<05:42, 4.13s/it, loss=0.903, lr=7.25e-5]
Steps: 83%|████████▎ | 417/500 [27:23<05:42, 4.13s/it, loss=0.803, lr=7.09e-5]
Steps: 84%|████████▎ | 418/500 [27:25<04:42, 3.45s/it, loss=0.803, lr=7.09e-5]
Steps: 84%|████████▎ | 418/500 [27:25<04:42, 3.45s/it, loss=0.924, lr=6.93e-5]
Steps: 84%|████████▍ | 419/500 [27:27<04:01, 2.98s/it, loss=0.924, lr=6.93e-5]
Steps: 84%|████████▍ | 419/500 [27:27<04:01, 2.98s/it, loss=0.769, lr=6.77e-5]
Steps: 84%|████████▍ | 420/500 [27:29<03:31, 2.64s/it, loss=0.769, lr=6.77e-5]
Steps: 84%|████████▍ | 420/500 [27:29<03:31, 2.64s/it, loss=0.99, lr=6.62e-5]
Steps: 84%|████████▍ | 421/500 [27:37<05:28, 4.16s/it, loss=0.99, lr=6.62e-5]
Steps: 84%|████████▍ | 421/500 [27:37<05:28, 4.16s/it, loss=0.778, lr=6.46e-5]
Steps: 84%|████████▍ | 422/500 [27:38<04:31, 3.48s/it, loss=0.778, lr=6.46e-5]
Steps: 84%|████████▍ | 422/500 [27:38<04:31, 3.48s/it, loss=0.994, lr=6.31e-5]
Steps: 85%|████████▍ | 423/500 [27:40<03:50, 3.00s/it, loss=0.994, lr=6.31e-5]
Steps: 85%|████████▍ | 423/500 [27:40<03:50, 3.00s/it, loss=0.845, lr=6.16e-5]
Steps: 85%|████████▍ | 424/500 [27:42<03:22, 2.66s/it, loss=0.845, lr=6.16e-5]
Steps: 85%|████████▍ | 424/500 [27:42<03:22, 2.66s/it, loss=0.944, lr=6.01e-5]
Steps: 85%|████████▌ | 425/500 [27:50<05:13, 4.18s/it, loss=0.944, lr=6.01e-5]
Steps: 85%|████████▌ | 425/500 [27:50<05:13, 4.18s/it, loss=0.771, lr=5.86e-5]
Steps: 85%|████████▌ | 426/500 [27:52<04:17, 3.49s/it, loss=0.771, lr=5.86e-5]
Steps: 85%|████████▌ | 426/500 [27:52<04:17, 3.49s/it, loss=0.932, lr=5.71e-5]
Steps: 85%|████████▌ | 427/500 [27:54<03:39, 3.00s/it, loss=0.932, lr=5.71e-5]
Steps: 85%|████████▌ | 427/500 [27:54<03:39, 3.00s/it, loss=0.771, lr=5.56e-5]
Steps: 86%|████████▌ | 428/500 [27:55<03:11, 2.66s/it, loss=0.771, lr=5.56e-5]
Steps: 86%|████████▌ | 428/500 [27:55<03:11, 2.66s/it, loss=0.861, lr=5.42e-5]
Steps: 86%|████████▌ | 429/500 [28:03<04:57, 4.19s/it, loss=0.861, lr=5.42e-5]
Steps: 86%|████████▌ | 429/500 [28:03<04:57, 4.19s/it, loss=0.836, lr=5.28e-5]
Steps: 86%|████████▌ | 430/500 [28:05<04:04, 3.50s/it, loss=0.836, lr=5.28e-5]
Steps: 86%|████████▌ | 430/500 [28:05<04:04, 3.50s/it, loss=0.99, lr=5.14e-5]
Steps: 86%|████████▌ | 431/500 [28:07<03:27, 3.01s/it, loss=0.99, lr=5.14e-5]
Steps: 86%|████████▌ | 431/500 [28:07<03:27, 3.01s/it, loss=0.804, lr=5e-5]
Steps: 86%|████████▋ | 432/500 [28:09<03:01, 2.67s/it, loss=0.804, lr=5e-5]
Steps: 86%|████████▋ | 432/500 [28:09<03:01, 2.67s/it, loss=0.885, lr=4.86e-5]
Steps: 87%|████████▋ | 433/500 [28:17<04:42, 4.22s/it, loss=0.885, lr=4.86e-5]
Steps: 87%|████████▋ | 433/500 [28:17<04:42, 4.22s/it, loss=1, lr=4.72e-5]
Steps: 87%|████████▋ | 434/500 [28:19<03:51, 3.51s/it, loss=1, lr=4.72e-5]
Steps: 87%|████████▋ | 434/500 [28:19<03:51, 3.51s/it, loss=0.999, lr=4.59e-5]
Steps: 87%|████████▋ | 435/500 [28:20<03:16, 3.02s/it, loss=0.999, lr=4.59e-5]
Steps: 87%|████████▋ | 435/500 [28:20<03:16, 3.02s/it, loss=0.774, lr=4.46e-5]
Steps: 87%|████████▋ | 436/500 [28:22<02:51, 2.68s/it, loss=0.774, lr=4.46e-5]
Steps: 87%|████████▋ | 436/500 [28:22<02:51, 2.68s/it, loss=0.946, lr=4.33e-5]
Steps: 87%|████████▋ | 437/500 [28:30<04:21, 4.15s/it, loss=0.946, lr=4.33e-5]
Steps: 87%|████████▋ | 437/500 [28:30<04:21, 4.15s/it, loss=0.841, lr=4.2e-5]
Steps: 88%|████████▊ | 438/500 [28:32<03:34, 3.46s/it, loss=0.841, lr=4.2e-5]
Steps: 88%|████████▊ | 438/500 [28:32<03:34, 3.46s/it, loss=1.01, lr=4.07e-5]
Steps: 88%|████████▊ | 439/500 [28:34<03:02, 2.99s/it, loss=1.01, lr=4.07e-5]
Steps: 88%|████████▊ | 439/500 [28:34<03:02, 2.99s/it, loss=0.882, lr=3.94e-5]
Steps: 88%|████████▊ | 440/500 [28:36<02:39, 2.65s/it, loss=0.882, lr=3.94e-5]
Steps: 88%|████████▊ | 440/500 [28:36<02:39, 2.65s/it, loss=0.937, lr=3.82e-5]
Steps: 88%|████████▊ | 441/500 [28:44<04:11, 4.26s/it, loss=0.937, lr=3.82e-5]
Steps: 88%|████████▊ | 441/500 [28:44<04:11, 4.26s/it, loss=0.851, lr=3.7e-5]
Steps: 88%|████████▊ | 442/500 [28:45<03:25, 3.54s/it, loss=0.851, lr=3.7e-5]
Steps: 88%|████████▊ | 442/500 [28:45<03:25, 3.54s/it, loss=0.866, lr=3.58e-5]
Steps: 89%|████████▊ | 443/500 [28:47<02:53, 3.04s/it, loss=0.866, lr=3.58e-5]
Steps: 89%|████████▊ | 443/500 [28:47<02:53, 3.04s/it, loss=0.918, lr=3.46e-5]
Steps: 89%|████████▉ | 444/500 [28:49<02:30, 2.69s/it, loss=0.918, lr=3.46e-5]
Steps: 89%|████████▉ | 444/500 [28:49<02:30, 2.69s/it, loss=0.892, lr=3.34e-5]
Steps: 89%|████████▉ | 445/500 [28:57<03:49, 4.18s/it, loss=0.892, lr=3.34e-5]
Steps: 89%|████████▉ | 445/500 [28:57<03:49, 4.18s/it, loss=0.787, lr=3.23e-5]
Steps: 89%|████████▉ | 446/500 [28:59<03:08, 3.49s/it, loss=0.787, lr=3.23e-5]
Steps: 89%|████████▉ | 446/500 [28:59<03:08, 3.49s/it, loss=0.89, lr=3.11e-5]
Steps: 89%|████████▉ | 447/500 [29:01<02:39, 3.00s/it, loss=0.89, lr=3.11e-5]
Steps: 89%|████████▉ | 447/500 [29:01<02:39, 3.00s/it, loss=1.05, lr=3e-5]
Steps: 90%|████████▉ | 448/500 [29:02<02:18, 2.66s/it, loss=1.05, lr=3e-5]
Steps: 90%|████████▉ | 448/500 [29:02<02:18, 2.66s/it, loss=1.01, lr=2.89e-5]
Steps: 90%|████████▉ | 449/500 [29:10<03:31, 4.15s/it, loss=1.01, lr=2.89e-5]
Steps: 90%|████████▉ | 449/500 [29:10<03:31, 4.15s/it, loss=0.801, lr=2.79e-5]
Steps: 90%|█████████ | 450/500 [29:12<02:53, 3.47s/it, loss=0.801, lr=2.79e-5]
Steps: 90%|█████████ | 450/500 [29:12<02:53, 3.47s/it, loss=0.908, lr=2.68e-5]
Steps: 90%|█████████ | 451/500 [29:14<02:26, 2.99s/it, loss=0.908, lr=2.68e-5]
Steps: 90%|█████████ | 451/500 [29:14<02:26, 2.99s/it, loss=0.758, lr=2.58e-5]
Steps: 90%|█████████ | 452/500 [29:16<02:07, 2.65s/it, loss=0.758, lr=2.58e-5]
Steps: 90%|█████████ | 452/500 [29:16<02:07, 2.65s/it, loss=0.872, lr=2.47e-5]
Steps: 91%|█████████ | 453/500 [29:23<03:15, 4.15s/it, loss=0.872, lr=2.47e-5]
Steps: 91%|█████████ | 453/500 [29:23<03:15, 4.15s/it, loss=0.784, lr=2.37e-5]
Steps: 91%|█████████ | 454/500 [29:25<02:39, 3.47s/it, loss=0.784, lr=2.37e-5]
Steps: 91%|█████████ | 454/500 [29:25<02:39, 3.47s/it, loss=0.86, lr=2.28e-5]
Steps: 91%|█████████ | 455/500 [29:27<02:14, 2.99s/it, loss=0.86, lr=2.28e-5]
Steps: 91%|█████████ | 455/500 [29:27<02:14, 2.99s/it, loss=0.839, lr=2.18e-5]
Steps: 91%|█████████ | 456/500 [29:29<01:56, 2.66s/it, loss=0.839, lr=2.18e-5]
Steps: 91%|█████████ | 456/500 [29:29<01:56, 2.66s/it, loss=1, lr=2.09e-5]
Steps: 91%|█████████▏| 457/500 [29:37<02:59, 4.18s/it, loss=1, lr=2.09e-5]
Steps: 91%|█████████▏| 457/500 [29:37<02:59, 4.18s/it, loss=1.05, lr=1.99e-5]
Steps: 92%|█████████▏| 458/500 [29:39<02:26, 3.49s/it, loss=1.05, lr=1.99e-5]
Steps: 92%|█████████▏| 458/500 [29:39<02:26, 3.49s/it, loss=0.987, lr=1.9e-5]
Steps: 92%|█████████▏| 459/500 [29:40<02:03, 3.00s/it, loss=0.987, lr=1.9e-5]
Steps: 92%|█████████▏| 459/500 [29:40<02:03, 3.00s/it, loss=0.831, lr=1.82e-5]
Steps: 92%|█████████▏| 460/500 [29:42<01:46, 2.66s/it, loss=0.831, lr=1.82e-5]
Steps: 92%|█████████▏| 460/500 [29:42<01:46, 2.66s/it, loss=0.996, lr=1.73e-5]
Steps: 92%|█████████▏| 461/500 [29:50<02:42, 4.17s/it, loss=0.996, lr=1.73e-5]
Steps: 92%|█████████▏| 461/500 [29:50<02:42, 4.17s/it, loss=0.805, lr=1.64e-5]
Steps: 92%|█████████▏| 462/500 [29:52<02:12, 3.48s/it, loss=0.805, lr=1.64e-5]
Steps: 92%|█████████▏| 462/500 [29:52<02:12, 3.48s/it, loss=1.05, lr=1.56e-5]
Steps: 93%|█████████▎| 463/500 [29:54<01:50, 3.00s/it, loss=1.05, lr=1.56e-5]
Steps: 93%|█████████▎| 463/500 [29:54<01:50, 3.00s/it, loss=0.837, lr=1.48e-5]
Steps: 93%|█████████▎| 464/500 [29:56<01:35, 2.66s/it, loss=0.837, lr=1.48e-5]
Steps: 93%|█████████▎| 464/500 [29:56<01:35, 2.66s/it, loss=0.87, lr=1.4e-5]
Steps: 93%|█████████▎| 465/500 [30:03<02:24, 4.13s/it, loss=0.87, lr=1.4e-5]
Steps: 93%|█████████▎| 465/500 [30:03<02:24, 4.13s/it, loss=0.973, lr=1.33e-5]
Steps: 93%|█████████▎| 466/500 [30:05<01:57, 3.45s/it, loss=0.973, lr=1.33e-5]
Steps: 93%|█████████▎| 466/500 [30:05<01:57, 3.45s/it, loss=0.984, lr=1.25e-5]
Steps: 93%|█████████▎| 467/500 [30:07<01:38, 2.98s/it, loss=0.984, lr=1.25e-5]
Steps: 93%|█████████▎| 467/500 [30:07<01:38, 2.98s/it, loss=0.84, lr=1.18e-5]
Steps: 94%|█████████▎| 468/500 [30:09<01:24, 2.65s/it, loss=0.84, lr=1.18e-5]
Steps: 94%|█████████▎| 468/500 [30:09<01:24, 2.65s/it, loss=0.917, lr=1.11e-5]
Steps: 94%|█████████▍| 469/500 [30:16<02:08, 4.15s/it, loss=0.917, lr=1.11e-5]
Steps: 94%|█████████▍| 469/500 [30:16<02:08, 4.15s/it, loss=0.838, lr=1.04e-5]
Steps: 94%|█████████▍| 470/500 [30:18<01:43, 3.46s/it, loss=0.838, lr=1.04e-5]
Steps: 94%|█████████▍| 470/500 [30:18<01:43, 3.46s/it, loss=0.949, lr=9.79e-6]
Steps: 94%|█████████▍| 471/500 [30:20<01:26, 2.99s/it, loss=0.949, lr=9.79e-6]
Steps: 94%|█████████▍| 471/500 [30:20<01:26, 2.99s/it, loss=0.809, lr=9.15e-6]
Steps: 94%|█████████▍| 472/500 [30:22<01:14, 2.65s/it, loss=0.809, lr=9.15e-6]
Steps: 94%|█████████▍| 472/500 [30:22<01:14, 2.65s/it, loss=1.05, lr=8.54e-6]
Steps: 95%|█████████▍| 473/500 [30:30<01:52, 4.17s/it, loss=1.05, lr=8.54e-6]
Steps: 95%|█████████▍| 473/500 [30:30<01:52, 4.17s/it, loss=0.781, lr=7.94e-6]
Steps: 95%|█████████▍| 474/500 [30:32<01:30, 3.48s/it, loss=0.781, lr=7.94e-6]
Steps: 95%|█████████▍| 474/500 [30:32<01:30, 3.48s/it, loss=1.02, lr=7.37e-6]
Steps: 95%|█████████▌| 475/500 [30:33<01:14, 3.00s/it, loss=1.02, lr=7.37e-6]
Steps: 95%|█████████▌| 475/500 [30:33<01:14, 3.00s/it, loss=0.77, lr=6.81e-6]
Steps: 95%|█████████▌| 476/500 [30:35<01:03, 2.66s/it, loss=0.77, lr=6.81e-6]
Steps: 95%|█████████▌| 476/500 [30:35<01:03, 2.66s/it, loss=0.964, lr=6.28e-6]
Steps: 95%|█████████▌| 477/500 [30:43<01:35, 4.16s/it, loss=0.964, lr=6.28e-6]
Steps: 95%|█████████▌| 477/500 [30:43<01:35, 4.16s/it, loss=0.813, lr=5.77e-6]
Steps: 96%|█████████▌| 478/500 [30:45<01:16, 3.47s/it, loss=0.813, lr=5.77e-6]
Steps: 96%|█████████▌| 478/500 [30:45<01:16, 3.47s/it, loss=1.06, lr=5.28e-6]
Steps: 96%|█████████▌| 479/500 [30:47<01:02, 3.00s/it, loss=1.06, lr=5.28e-6]
Steps: 96%|█████████▌| 479/500 [30:47<01:02, 3.00s/it, loss=0.959, lr=4.82e-6]
Steps: 96%|█████████▌| 480/500 [30:49<00:53, 2.66s/it, loss=0.959, lr=4.82e-6]
Steps: 96%|█████████▌| 480/500 [30:49<00:53, 2.66s/it, loss=0.859, lr=4.37e-6]
Steps: 96%|█████████▌| 481/500 [30:56<01:19, 4.17s/it, loss=0.859, lr=4.37e-6]
Steps: 96%|█████████▌| 481/500 [30:56<01:19, 4.17s/it, loss=1.03, lr=3.95e-6]
Steps: 96%|█████████▋| 482/500 [30:58<01:02, 3.48s/it, loss=1.03, lr=3.95e-6]
Steps: 96%|█████████▋| 482/500 [30:58<01:02, 3.48s/it, loss=1.07, lr=3.54e-6]
Steps: 97%|█████████▋| 483/500 [31:00<00:50, 3.00s/it, loss=1.07, lr=3.54e-6]
Steps: 97%|█████████▋| 483/500 [31:00<00:50, 3.00s/it, loss=0.791, lr=3.16e-6]
Steps: 97%|█████████▋| 484/500 [31:02<00:42, 2.66s/it, loss=0.791, lr=3.16e-6]
Steps: 97%|█████████▋| 484/500 [31:02<00:42, 2.66s/it, loss=0.934, lr=2.8e-6]
Steps: 97%|█████████▋| 485/500 [31:10<01:02, 4.15s/it, loss=0.934, lr=2.8e-6]
Steps: 97%|█████████▋| 485/500 [31:10<01:02, 4.15s/it, loss=0.777, lr=2.46e-6]
Steps: 97%|█████████▋| 486/500 [31:11<00:48, 3.47s/it, loss=0.777, lr=2.46e-6]
Steps: 97%|█████████▋| 486/500 [31:11<00:48, 3.47s/it, loss=0.957, lr=2.15e-6]
Steps: 97%|█████████▋| 487/500 [31:13<00:38, 2.99s/it, loss=0.957, lr=2.15e-6]
Steps: 97%|█████████▋| 487/500 [31:13<00:38, 2.99s/it, loss=1.04, lr=1.85e-6]
Steps: 98%|█████████▊| 488/500 [31:15<00:31, 2.65s/it, loss=1.04, lr=1.85e-6]
Steps: 98%|█████████▊| 488/500 [31:15<00:31, 2.65s/it, loss=1.01, lr=1.58e-6]
Steps: 98%|█████████▊| 489/500 [31:23<00:45, 4.12s/it, loss=1.01, lr=1.58e-6]
Steps: 98%|█████████▊| 489/500 [31:23<00:45, 4.12s/it, loss=0.779, lr=1.33e-6]
Steps: 98%|█████████▊| 490/500 [31:25<00:34, 3.45s/it, loss=0.779, lr=1.33e-6]
Steps: 98%|█████████▊| 490/500 [31:25<00:34, 3.45s/it, loss=0.955, lr=1.1e-6]
Steps: 98%|█████████▊| 491/500 [31:26<00:26, 2.97s/it, loss=0.955, lr=1.1e-6]
Steps: 98%|█████████▊| 491/500 [31:26<00:26, 2.97s/it, loss=0.839, lr=8.88e-7]
Steps: 98%|█████████▊| 492/500 [31:28<00:21, 2.64s/it, loss=0.839, lr=8.88e-7]
Steps: 98%|█████████▊| 492/500 [31:28<00:21, 2.64s/it, loss=1.06, lr=7.01e-7]
Steps: 99%|█████████▊| 493/500 [31:36<00:29, 4.19s/it, loss=1.06, lr=7.01e-7]
Steps: 99%|█████████▊| 493/500 [31:36<00:29, 4.19s/it, loss=0.975, lr=5.37e-7]
Steps: 99%|█████████▉| 494/500 [31:38<00:20, 3.49s/it, loss=0.975, lr=5.37e-7]
Steps: 99%|█████████▉| 494/500 [31:38<00:20, 3.49s/it, loss=0.987, lr=3.95e-7]
Steps: 99%|█████████▉| 495/500 [31:40<00:15, 3.01s/it, loss=0.987, lr=3.95e-7]
Steps: 99%|█████████▉| 495/500 [31:40<00:15, 3.01s/it, loss=0.783, lr=2.74e-7]
Steps: 99%|█████████▉| 496/500 [31:42<00:10, 2.67s/it, loss=0.783, lr=2.74e-7]
Steps: 99%|█████████▉| 496/500 [31:42<00:10, 2.67s/it, loss=0.941, lr=1.75e-7]
Steps: 99%|█████████▉| 497/500 [31:49<00:12, 4.18s/it, loss=0.941, lr=1.75e-7]
Steps: 99%|█████████▉| 497/500 [31:49<00:12, 4.18s/it, loss=0.78, lr=9.87e-8]
Steps: 100%|█████████▉| 498/500 [31:51<00:06, 3.49s/it, loss=0.78, lr=9.87e-8]
Steps: 100%|█████████▉| 498/500 [31:51<00:06, 3.49s/it, loss=0.928, lr=4.39e-8]
Steps: 100%|█████████▉| 499/500 [31:53<00:03, 3.00s/it, loss=0.928, lr=4.39e-8]
Steps: 100%|█████████▉| 499/500 [31:53<00:03, 3.00s/it, loss=1.03, lr=1.1e-8]
Steps: 100%|██████████| 500/500 [31:55<00:00, 2.66s/it, loss=1.03, lr=1.1e-8]
Steps: 100%|██████████| 500/500 [31:55<00:00, 2.66s/it, loss=0.906, lr=0]
Steps: 100%|██████████| 500/500 [31:59<00:00, 3.84s/it, loss=0.906, lr=0]
---Tar up output directory---
mochi-lora/
mochi-lora/pytorch_lora_weights.safetensors
Uploading to Hugging Face: lucataco/mochi-lora-vhs
HF Repo URL: https://huggingface.co/lucataco/mochi-lora-vhs
pytorch_lora_weights.safetensors: 0%| | 0.00/76.1M [00:00<?, ?B/s]
pytorch_lora_weights.safetensors: 10%|▉ | 7.34M/76.1M [00:00<00:00, 73.4MB/s]
pytorch_lora_weights.safetensors: 21%|██ | 16.0M/76.1M [00:00<00:01, 42.3MB/s]
pytorch_lora_weights.safetensors: 42%|████▏ | 32.0M/76.1M [00:00<00:00, 46.4MB/s]
pytorch_lora_weights.safetensors: 63%|██████▎ | 48.0M/76.1M [00:00<00:00, 54.6MB/s]
pytorch_lora_weights.safetensors: 84%|████████▍ | 64.0M/76.1M [00:01<00:00, 57.3MB/s]
pytorch_lora_weights.safetensors: 100%|██████████| 76.1M/76.1M [00:01<00:00, 54.6MB/s]
Successfully uploaded model to https://huggingface.co/lucataco/mochi-lora-vhs
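
The `lr` values in the final steps of the log above fall to exactly 0 at step 500, which suggests a cosine decay. The sketch below checks that reading against a few of the logged values. The base rate (`0.0004`) and step count (`500`) come from the prediction input. The decay window of 300 steps is an assumption, found by fitting the logged numbers; the log does not state it, and the trainer's actual scheduler is not shown here. The names `cosine_lr`, `DECAY_STEPS`, and `LOGGED_LR` are illustrative only.

```python
import math

BASE_LR = 4e-4      # learning_rate from the prediction input
TOTAL_STEPS = 500   # steps from the prediction input
DECAY_STEPS = 300   # ASSUMPTION: fitted to the logged values, not stated in the log

# lr values copied from the log, keyed by the completed-step count shown
# on the progress bar line where each value first appears.
LOGGED_LR = {
    480: 4.37e-6,
    481: 3.95e-6,
    484: 2.8e-6,
    490: 1.1e-6,
    495: 2.74e-7,
    499: 1.1e-8,
    500: 0.0,
}


def cosine_lr(completed_steps: int) -> float:
    """Cosine decay to zero, written in terms of the steps still remaining.

    Only meaningful inside the assumed decay window, i.e. for
    completed_steps >= TOTAL_STEPS - DECAY_STEPS.
    """
    remaining = TOTAL_STEPS - completed_steps
    return 0.5 * BASE_LR * (1.0 - math.cos(math.pi * remaining / DECAY_STEPS))


if __name__ == "__main__":
    for step, logged in LOGGED_LR.items():
        predicted = cosine_lr(step)
        # The progress bar prints three significant digits, so allow 1% slack.
        ok = math.isclose(predicted, logged, rel_tol=0.01, abs_tol=1e-12)
        print(f"step {step}: logged={logged:.3g} predicted={predicted:.3g} match={ok}")
```

If the assumption holds, the first 200 steps would be outside the decay window (for example a warmup phase), but nothing in this part of the log confirms that.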