Skforecast in GPU¶
Traditionally, machine learning algorithms are executed on CPUs (Central Processing Units), which are general-purpose processors that are designed to handle a wide range of tasks. However, CPUs are not optimized for the highly parallelized matrix operations that are required by many machine learning algorithms, which can result in slow training times and limited scalability. GPUs, on the other hand, are designed specifically for parallel processing and can perform thousands of mathematical operations simultaneously, making them ideal for training and deploying large-scale machine learning models.
Three popular machine learning libraries that have implemented GPU acceleration are XGBoost, LightGBM and CatBoost. These libraries are used for building gradient boosting models, which are a type of machine learning algorithm that is highly effective for a wide range of tasks, including forecasting. With GPU acceleration, these libraries can significantly reduce the training time required to build these models and improve their scalability.
Despite the significant advantages offered by GPUs (specifically Nvidia GPUs) in accelerating machine learning computations, access to them is often limited due to high costs or other practical constraints. Fortunately, Google Colaboratory (Colab), a free Jupyter notebook environment, allows users to run Python code in the cloud with access to powerful hardware resources such as GPUs. This makes it an excellent platform for experimenting with machine learning models, especially those that require intensive computations.
The following sections demonstrate how to install and use XGBoost and LightGBM with GPU acceleration to create powerful forecasting models.
Note
The following code assumes that the user is executing it in Google Colab with an activated GPU runtime.
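Before running the examples, it can help to confirm that the runtime actually exposes an Nvidia GPU. The sketch below is one way to do this using only the standard library. It assumes that the `nvidia-smi` utility is on the PATH whenever a GPU runtime is active, and the helper name `nvidia_gpu_visible` is hypothetical.

```python
import shutil
import subprocess


def nvidia_gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` runs successfully and lists a GPU."""
    # Assumption: nvidia-smi is on the PATH when an Nvidia driver is installed.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L lists the GPUs visible to the driver
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("Nvidia GPU visible:", nvidia_gpu_visible())
```

If this prints `False` in Colab, the runtime type is probably still set to CPU.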
XGBoost¶
Version >= 2.0¶
When creating the model with XGBoost version >= 2.0, two arguments are needed to tell XGBoost to run on the GPU, if one is available: device='cuda' and tree_method='hist'.
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import time
from xgboost import XGBRegressor
from skforecast.ForecasterAutoreg import ForecasterAutoreg
import torch
import os
import sys
import psutil
# Print information about the GPU and CPU
# ==============================================================================
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    # memory_reserved replaces the deprecated memory_cached
    print('Reserved: ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')
print(f"CPU RAM Free: {psutil.virtual_memory().available / 1024**3:.2f} GB")
# Data
# ==============================================================================
data = pd.Series(np.random.normal(size=1000000))
# Create and train forecaster with a XGBRegressor using GPU
# ==============================================================================
forecaster = ForecasterAutoreg(
                 regressor = XGBRegressor(
                                 n_estimators=5000,
                                 tree_method='hist',
                                 device='cuda'
                             ),
                 lags = 20
             )
forecaster.fit(y=data)
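The `time` module imported above can be used to measure how long training takes, which is the simplest way to check whether the GPU actually helps for a given data size. The following is a minimal sketch. `timed_call` is a hypothetical helper, not part of skforecast or XGBoost, and a stand-in workload is used so that the block runs without a GPU.

```python
import time


def timed_call(func, *args, **kwargs):
    """Run func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload so the sketch runs anywhere. In the notebook, the
# forecaster's fit method would be passed instead, for example:
#     _, elapsed = timed_call(forecaster.fit, y=data)
_, elapsed = timed_call(sum, range(1_000_000))
print(f"Elapsed time: {elapsed:.4f} s")
```

Repeating the measurement with a forecaster whose regressor uses `device='cpu'` gives a baseline for comparison.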
Version < 2.0¶
When creating the model with XGBoost version < 2.0, two arguments are needed to tell XGBoost to run on the GPU, if one is available: tree_method='gpu_hist' and gpu_id=0.
# Create and train forecaster with a XGBRegressor using GPU
# ==============================================================================
forecaster = ForecasterAutoreg(
                 regressor = XGBRegressor(
                                 n_estimators=5000,
                                 tree_method='gpu_hist',
                                 gpu_id=0
                             ),
                 lags = 20
             )
forecaster.fit(y=data)
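Since the GPU arguments differ between major versions, one option is to choose them from the installed version string. The sketch below illustrates the idea. `xgboost_gpu_params` is a hypothetical helper, and in practice the version string would come from `xgboost.__version__`.

```python
def xgboost_gpu_params(version: str) -> dict:
    """Return the GPU-related XGBRegressor arguments for an XGBoost version."""
    # Only the major version number matters for this choice.
    major = int(version.split(".")[0])
    if major >= 2:
        return {"tree_method": "hist", "device": "cuda"}
    return {"tree_method": "gpu_hist", "gpu_id": 0}


print(xgboost_gpu_params("2.0.3"))
print(xgboost_gpu_params("1.7.6"))
```

The returned dictionary can then be unpacked into the regressor, for example `XGBRegressor(n_estimators=5000, **xgboost_gpu_params(xgboost.__version__))`.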
LightGBM¶
To run on the GPU, LightGBM has to be built with GPU support. The following commands, intended for Google Colab, remove the preinstalled package, compile LightGBM from source with the USE_GPU option, and install the resulting Python package.
!rm -r /opt/conda/lib/python3.6/site-packages/lightgbm
!git clone --recursive https://github.com/Microsoft/LightGBM
!apt-get install -y -qq libboost-all-dev
%%bash
cd LightGBM
rm -r build
mkdir build
cd build
cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ ..
make -j$(nproc)
!cd LightGBM/python-package/;python3 setup.py install --precompile
!mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
!rm -r LightGBM
Once the installation steps above have finished, restart the runtime (kernel) so that the newly built package is loaded.
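After restarting, a quick smoke test can show whether the installed LightGBM build is able to train on the GPU before launching a long job. The following is a minimal sketch under the assumption that a build without GPU support, or a missing device, causes training to raise an exception. `lightgbm_gpu_available` is a hypothetical helper name.

```python
def lightgbm_gpu_available() -> bool:
    """Return True if a tiny LightGBM model trains with device_type='gpu'."""
    try:
        import numpy as np
        from lightgbm import LGBMRegressor

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 5))
        y = rng.normal(size=200)
        # Assumption: training raises an exception if the installed build
        # has no GPU support or no usable device is found.
        LGBMRegressor(n_estimators=5, device_type="gpu", verbose=-1).fit(X, y)
        return True
    except Exception:
        # Covers a missing package as well as a failed GPU training attempt.
        return False


print("LightGBM GPU training works:", lightgbm_gpu_available())
```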
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import time
from lightgbm import LGBMRegressor
from skforecast.ForecasterAutoreg import ForecasterAutoreg
# Data
# ==============================================================================
data = pd.Series(np.random.normal(size=1000000))
# Create and train forecaster with a LGBMRegressor using GPU
# ==============================================================================
forecaster = ForecasterAutoreg(
                 regressor = LGBMRegressor(n_estimators=5000, device_type='gpu'),
                 lags = 20
             )
forecaster.fit(y=data)