
[ERROR] Chronos Inference on GPU with Torch

#6
by IUseAMouse - opened

Hello,

Chronos works just fine on CPU, but I run into an error when trying to run it on GPU. When passing a torch tensor on CUDA, I get the following error:

'''
File "/srv/home/yvincent/mlhive/projects/timeseries/src/forecast.py", line 80, in forecast
forecast = self.pipeline.predict(data, self.prediction_length)
File "/srv/home/yvincent/miniconda3/envs/onnx/lib/python3.13/site-packages/chronos/chronos.py", line 507, in predict
token_ids, attention_mask, scale = self.tokenizer.context_input_transform(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
context_tensor
^^^^^^^^^^^^^^
)
^
File "/srv/home/yvincent/miniconda3/envs/onnx/lib/python3.13/site-packages/chronos/chronos.py", line 224, in context_input_transform
token_ids, attention_mask, scale = self._input_transform(context=context)
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "/srv/home/yvincent/miniconda3/envs/onnx/lib/python3.13/site-packages/chronos/chronos.py", line 189, in _input_transform
torch.bucketize(
~~~~~~~~~~~~~~~^
input=scaled_context,
^^^^^^^^^^^^^^^^^^^^^
...<3 lines>...
right=True,
^^^^^^^^^^^
)
'''

How to reproduce the error:
'''
import torch
from chronos import ChronosPipeline

device = 'cuda'
pipeline = ChronosPipeline.from_pretrained(
    'amazon/chronos-t5-mini',
    device_map=device,
    torch_dtype=torch.bfloat16,
)
batch_size = 4
seq_len = 500
pred_len = 64
data = torch.rand(batch_size, seq_len).to(device)

forecast = pipeline.predict(data, pred_len)
'''

This code works perfectly well with device='cpu'. I am running this on a V100 32GB. Note that I run into this error with every chronos-t5 model (tiny, mini, small, and large).

The line that throws the error in the library is line 189 of src/chronos/chronos.py:
'''
token_ids = (
torch.bucketize(
input=scaled_context,
boundaries=self.boundaries,
# buckets are open to the right, see:
# https://pytorch.org/docs/2.1/generated/torch.bucketize.html#torch-bucketize
right=True,
)
+ self.config.n_special_tokens
)
'''

Is it possible that scaled_context and self.boundaries are not cast to the same device? Please let me know if there's a fix or if I can help.
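That mismatch can be reproduced in isolation: torch.bucketize requires input and boundaries to live on the same device. A minimal sketch (the boundary values here are made up, not Chronos' actual quantization edges) that runs fine on CPU, and would raise the same RuntimeError if scaled_context were moved to CUDA while the boundaries stayed on CPU:

```python
import torch

# Stand-in for self.boundaries; the real Chronos boundaries differ.
boundaries = torch.linspace(-15.0, 15.0, steps=9)

scaled_context = torch.rand(4, 500)  # CPU tensor, same device as boundaries

# Works: both tensors live on the CPU.
token_ids = torch.bucketize(input=scaled_context, boundaries=boundaries, right=True)
print(token_ids.shape)  # torch.Size([4, 500])

# scaled_context.cuda() here would make the same call raise a RuntimeError
# complaining that the tensors are on different devices.
```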

Amazon org

@IUseAMouse if you change

'''
data = torch.rand(batch_size, seq_len).to(device)
'''

into

'''
data = torch.rand(batch_size, seq_len)
'''

then it should work fine: there's no need to put the data on the GPU yourself. The reason is that the tokenizer always sits on the CPU, and that's where bucketization happens. The pipeline takes care of moving the quantized data to the right device.
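That CPU-side quantization followed by a device move can be sketched as follows (the boundary values and the special-token offset are illustrative, not Chronos' real configuration):

```python
import torch

n_special_tokens = 2                               # illustrative offset, not Chronos' value
boundaries = torch.linspace(-15.0, 15.0, steps=9)  # illustrative bucket edges, kept on CPU

data = torch.rand(4, 500)  # leave the input on the CPU: no .to(device) needed
token_ids = torch.bucketize(input=data, boundaries=boundaries, right=True) + n_special_tokens

# Only after tokenization would a pipeline move the quantized data to the model's device:
device = "cuda" if torch.cuda.is_available() else "cpu"
token_ids = token_ids.to(device)
```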

It works like a charm now, thank you for your help!

IUseAMouse changed discussion status to closed