6. LightGBM

1. LightGBM

LightGBM is a gradient boosting framework developed by Microsoft, known for fast training and high efficiency. It is particularly strong at handling large datasets.

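Advantages

Very fast training and prediction
Low memory usage
Native handling of categorical variables (no manual one-hot encoding needed)
Parallel training support
Effective on large datasets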

Disadvantages

Risk of overfitting on small datasets
Hyperparameter tuning can be tricky
With little data, it may perform worse than XGBoost

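2. Key Features of LightGBM

Leaf-wise tree growth: at each step, splits the leaf that yields the largest loss reduction
GOSS (Gradient-based One-Side Sampling): sampling that prioritizes instances with large gradients
EFB (Exclusive Feature Bundling): bundles sparse features together to reduce dimensionality
Categorical feature support: handles categorical features directly, without manual one-hot encoding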
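To make the GOSS idea more concrete, here is a rough NumPy sketch of a single sampling step: keep the instances with the largest absolute gradients, randomly sample from the rest, and up-weight the sampled ones to compensate. The function name goss_sample is hypothetical, and this is a simplified illustration, not LightGBM's internal implementation. The top_rate and other_rate arguments mirror the LightGBM parameters of the same names.

```python
import numpy as np

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by a GOSS-style sampling step."""
    rng = np.random.default_rng(seed)
    n = len(gradients)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    # Sort instance indices by absolute gradient, largest first
    order = np.argsort(-np.abs(gradients))
    top_idx = order[:n_top]    # large-gradient instances are always kept
    rest_idx = order[n_top:]

    # Randomly sample from the remaining small-gradient instances
    sampled_idx = rng.choice(rest_idx, size=n_other, replace=False)

    # Up-weight the sampled small-gradient instances by (1 - a) / b
    weights = np.ones(n_top + n_other)
    weights[n_top:] = (1.0 - top_rate) / other_rate

    return np.concatenate([top_idx, sampled_idx]), weights

grads = np.random.default_rng(42).normal(size=1000)
idx, w = goss_sample(grads)
print(len(idx), w[0], w[-1])
```

With these rates, only 30% of the instances are used to evaluate splits, while the weights keep the contribution of the small-gradient group roughly in proportion.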

3. Hands-on Code

import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=20,
                         n_informative=15, n_redundant=5,
                         random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2,
                                                    random_state=42)

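# Create the LightGBM model
lgb_model = lgb.LGBMClassifier(
    n_estimators=100,       # number of boosting rounds
    learning_rate=0.1,
    max_depth=5,            # depth limit per tree
    num_leaves=31,          # kept below 2**max_depth (32)
    min_child_samples=20,   # minimum samples per leaf
    subsample=0.8,          # row subsampling ratio
    subsample_freq=1,       # row subsampling only takes effect when this is > 0
    colsample_bytree=0.8,   # feature subsampling ratio per tree
    random_state=42
)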

# Train the model
lgb_model.fit(X_train, y_train)

# Visualize feature importance
# plot_importance creates its own figure when no ax is passed, so the size is set here
lgb.plot_importance(lgb_model, max_num_features=10, figsize=(10, 6))
plt.title('LightGBM Feature Importance')
plt.tight_layout()
plt.show()

# Predict and evaluate performance
from sklearn.metrics import accuracy_score, classification_report

y_pred = lgb_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model accuracy: {accuracy:.4f}")
print("\nClassification report:")
print(classification_report(y_test, y_pred))
