1. HuggingFace inference error: Rate limit reached. Please log in or use a HF access token

HfHubHTTPError: 429 Client Error: Too Many Requests for url: https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct (Request ID: qpGM1fqyzXf4Jd-8aG1Dz)

This error can have several causes:

  • Make sure the access token is set up in the environment file under the HUGGINGFACEHUB_API_TOKEN variable

  • If the access token in use is fine-grained, check that it has the inference permissions needed to make calls

  • You may have made too many calls to the inference endpoint; wait a couple of minutes and try again
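
The checks above can be sketched in Python. This is a minimal illustration, not part of any HuggingFace library: get_hf_token and call_with_backoff are hypothetical helper names, and the retry logic assumes the raised error carries a response attribute with a status_code (as errors based on requests.HTTPError, such as HfHubHTTPError, normally do).

```python
import os
import time


def get_hf_token(env=os.environ):
    """Return the token from HUGGINGFACEHUB_API_TOKEN, or raise if it is missing."""
    token = env.get("HUGGINGFACEHUB_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HUGGINGFACEHUB_API_TOKEN is not set")
    return token


def call_with_backoff(fn, max_retries=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on an HTTP 429 error, wait and retry with doubling delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as err:
            # Assumption: the error exposes err.response.status_code.
            response = getattr(err, "response", None)
            status = getattr(response, "status_code", None)
            # Re-raise anything that is not a rate limit, or the final failure.
            if status != 429 or attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

To use it, wrap the inference call in a zero-argument function, for example call_with_backoff(lambda: my_inference_call(prompt)), where my_inference_call stands in for whatever call raised the 429. The default delays are a few seconds; increase base_delay if the limit takes minutes to reset.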

2. HuggingFace inference endpoint authorization error

huggingface Authorization header is correct, but the token seems invalid

  • Make sure the access token is set up in the environment file under the HUGGINGFACEHUB_API_TOKEN variable

  • Go to the HuggingFace cache folder (typically ~/.cache/huggingface) and delete the file named token

  • Restart the notebook kernel and try again

  • HuggingFace reportedly blocks some IP addresses at its WAF (web application firewall). If the error keeps happening, you may need to change your machine's IP address; see https://github.com/huggingface/transformers/issues/21129

  • Try running the code on Google Colab
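
The cached-token cleanup step can be sketched as follows. The helper names are hypothetical, and the location is an assumption: the sketch looks for a file named token under $HF_HOME if that variable is set, otherwise under ~/.cache/huggingface. Check where the cache lives on your machine before deleting anything.

```python
import os
from pathlib import Path


def default_hf_home():
    """Guess the HuggingFace cache folder (assumption: $HF_HOME or ~/.cache/huggingface)."""
    return Path(os.environ.get("HF_HOME", Path.home() / ".cache" / "huggingface"))


def remove_cached_token(hf_home=None):
    """Delete the file named 'token' in the cache folder; return True if it was removed."""
    token_file = Path(hf_home or default_hf_home()) / "token"
    if token_file.is_file():
        token_file.unlink()
        return True
    return False
```

Calling remove_cached_token() with no argument uses the guessed default folder; pass an explicit path if your cache is elsewhere. Restart the notebook kernel afterwards so the stale token is not still held in memory.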