Skforecast in GPU
Traditionally, machine learning algorithms are executed on CPUs (Central Processing Units), which are general-purpose processors that are designed to handle a wide range of tasks. However, CPUs are not optimized for the highly parallelized matrix operations that are required by many machine learning algorithms, which can result in slow training times and limited scalability. GPUs, on the other hand, are designed specifically for parallel processing and can perform thousands of mathematical operations simultaneously, making them ideal for training and deploying large-scale machine learning models.
Two popular machine learning libraries that have implemented GPU acceleration are XGBoost and LightGBM. These libraries are used for building gradient boosting models, which are a type of machine learning algorithm that is highly effective for a wide range of tasks, including forecasting. With GPU acceleration, these libraries can significantly reduce the training time required to build these models and improve their scalability.
Despite the significant advantages offered by GPUs (specifically Nvidia GPUs) in accelerating machine learning computations, access to them is often limited due to high costs or other practical constraints. Fortunately, Google Colaboratory (Colab), a free Jupyter notebook environment, allows users to run Python code in the cloud, with access to powerful hardware resources such as GPUs. This makes it an excellent platform for experimenting with machine learning models, especially those that require intensive computations.
The following sections demonstrate how to install and use XGBoost and LightGBM with GPU acceleration to create powerful forecasting models.
  Warning
The following code assumes that the user is executing it in Google Colab with an activated GPU runtime.
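Because everything that follows assumes a GPU runtime, it can be worth confirming that a GPU is actually visible before training. The sketch below is one way to do that using only the standard library. The helper name `nvidia_gpu_visible` is hypothetical, and it assumes the NVIDIA driver's `nvidia-smi` tool is on the PATH, which is typically the case on a Colab GPU runtime.

```python
import shutil
import subprocess


def nvidia_gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` lists at least one GPU (hypothetical helper)."""
    # Without the driver tools on PATH there is no NVIDIA GPU to report.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L prints one line per detected GPU
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("NVIDIA GPU visible:", nvidia_gpu_visible())
```

If this prints False in Colab, the runtime type is probably still set to CPU.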
XGBoost
When creating the model, only two arguments are needed to tell XGBoost to use the GPU, if one is available: tree_method='gpu_hist' and gpu_id=0.
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import time
from xgboost import XGBRegressor
from skforecast.ForecasterAutoreg import ForecasterAutoreg
import torch
import os
import sys
import psutil
# Setting device on GPU if available, else CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
# GPU info
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    # memory_cached is deprecated in PyTorch; memory_reserved is its replacement
    print('Reserved: ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')
# CPU info
print(f"CPU RAM Free: {psutil.virtual_memory().available / 1024**3:.2f} GB")
Using device: cpu
CPU RAM Free: 6.81 GB
# Data
# ==============================================================================
data = pd.Series(np.random.normal(size=1000000))
# Create and train forecaster with a XGBRegressor using GPU
# ==============================================================================
forecaster = ForecasterAutoreg(
                 regressor = XGBRegressor(
                                 n_estimators=5000,
                                 tree_method='gpu_hist',
                                 gpu_id=0
                             ),
                 lags = 20
             )
forecaster.fit(y=data)
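The `tree_method='gpu_hist'` and `gpu_id=0` arguments used above are the spelling for XGBoost releases before 2.0. From 2.0 onward those names are deprecated, and the equivalent request is `tree_method='hist'` together with `device='cuda'`. A minimal sketch of choosing the arguments from the version string follows. The helper `xgb_gpu_kwargs` is hypothetical and is not part of XGBoost or skforecast.

```python
def xgb_gpu_kwargs(xgb_version: str, gpu_id: int = 0) -> dict:
    """Return GPU keyword arguments for XGBRegressor (hypothetical helper)."""
    major = int(xgb_version.split(".")[0])
    if major >= 2:
        # XGBoost >= 2.0: the device is selected with `device`,
        # and the tree method is the plain 'hist'.
        return {"tree_method": "hist", "device": f"cuda:{gpu_id}"}
    # XGBoost < 2.0: the arguments used in this section.
    return {"tree_method": "gpu_hist", "gpu_id": gpu_id}


print(xgb_gpu_kwargs("1.7.6"))
print(xgb_gpu_kwargs("2.0.3"))
```

In practice the version string would come from `xgboost.__version__`, and the result would be unpacked into the regressor, for example `XGBRegressor(n_estimators=5000, **xgb_gpu_kwargs(xgboost.__version__))`.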
LightGBM
!rm -r /opt/conda/lib/python3.6/site-packages/lightgbm
!git clone --recursive https://github.com/Microsoft/LightGBM
!apt-get install -y -qq libboost-all-dev
%%bash
cd LightGBM
rm -r build
mkdir build
cd build
cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ ..
make -j$(nproc)
!cd LightGBM/python-package/;python3 setup.py install --precompile
!mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
!rm -r LightGBM
Once all the installation steps above have finished, the runtime (kernel) must be restarted before importing LightGBM.
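After the restart, a quick way to confirm that the rebuilt package can really train on the GPU is to fit a tiny model with `device_type='gpu'` and see whether it raises. The sketch below does that. The helper name `lightgbm_gpu_works` and the toy data sizes are assumptions for illustration, and the function simply reports False if LightGBM is missing or the GPU fit fails for any reason.

```python
def lightgbm_gpu_works() -> bool:
    """Try a tiny GPU fit; return False on any failure (hypothetical helper)."""
    try:
        import numpy as np
        from lightgbm import LGBMRegressor

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 5))
        y = rng.normal(size=500)
        # A CPU-only build raises here because the GPU tree learner
        # is not compiled in.
        LGBMRegressor(n_estimators=5, device_type="gpu", verbose=-1).fit(X, y)
        return True
    except Exception:
        return False


print("LightGBM GPU training works:", lightgbm_gpu_works())
```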
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import time
from lightgbm import LGBMRegressor
from skforecast.ForecasterAutoreg import ForecasterAutoreg
# Data
# ==============================================================================
data = pd.Series(np.random.normal(size=1000000))
# Create and train forecaster with a LGBMRegressor using GPU
# ==============================================================================
forecaster = ForecasterAutoreg(
                 regressor = LGBMRegressor(n_estimators=5000, device_type='gpu'),
                 lags = 20
             )
forecaster.fit(y=data)
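Whether the GPU pays off depends on the data size and the model, so it is worth measuring rather than assuming. The sketch below times a single `fit` call with `time.perf_counter`. The helper `timed_fit` is hypothetical and works with any object exposing `fit(y=...)`, such as the forecasters above.

```python
import time


def timed_fit(forecaster, y) -> float:
    """Fit `forecaster` on `y` and return elapsed wall-clock seconds (hypothetical helper)."""
    start = time.perf_counter()
    forecaster.fit(y=y)
    return time.perf_counter() - start


# Example usage with the objects defined in this section:
# elapsed = timed_fit(forecaster, data)
# print(f"Training time: {elapsed:.1f} s")
```

Running it once with `device_type='gpu'` and once with `device_type='cpu'` on the same data gives a like-for-like comparison.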